![categorical cross entropy](https://miro.medium.com/max/1104/1*BRGMVFP20nm5w61eL32KAw.png)
![categorical cross entropy](https://miro.medium.com/max/3338/1*ngGdMBNkF1U0-DJImomQvg.png)
#Cases where Cross-Entropy loss performs badly

1. Class imbalance introduces bias into the training process. The majority-class examples dominate the loss function and the gradient-descent updates, causing the weights to move in a direction that makes the model ever more confident in predicting the majority class while putting less emphasis on the minority classes.
2. It fails to distinguish between hard and easy examples. Hard examples are those on which the model repeatedly makes large errors, whereas easy examples are those it classifies with ease. As a result, Cross-Entropy loss fails to pay more attention to the hard examples.

Balanced Cross-Entropy loss handles the first problem, class imbalance. It adds a weighting factor to each class, represented by the Greek letter alpha (α). Alpha could be the inverse class frequency or a hyper-parameter determined by cross-validation, and it replaces the actual label term in the Cross-Entropy equation, as in the sketch below.
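To make the alpha weighting concrete, here is a minimal NumPy sketch of a binary Balanced Cross-Entropy loss. The function name, the default alpha = 0.25, and the convention that alpha weights the positive class (with 1 − alpha on the negative class) are illustrative assumptions, not code from the original article.

```python
import numpy as np

def balanced_cross_entropy(p, y, alpha=0.25):
    """Binary Cross-Entropy with a per-class weighting factor alpha
    (assumed convention: alpha on the positive class, 1 - alpha on the negative)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)            # avoid log(0)
    p_t = np.where(y == 1, p, 1 - p)          # probability assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * np.log(p_t)

y = np.array([1, 1, 0, 0])                    # labels
p = np.array([0.95, 0.30, 0.05, 0.70])        # easy, hard, easy, hard predictions
print(balanced_cross_entropy(p, y))
```

Note that alpha re-weights whole classes: every example of a class is scaled by the same factor, easy or hard alike. That remaining limitation is what the modulating factor and its gamma parameter address.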
#How does the gamma parameter work?

Down-weighting increases with an increase in γ, Image Source : Focal Loss Research Paper.

Focal Loss multiplies the Cross-Entropy term by a modulating factor, (1 − pi)^γ. As the confidence of the model increases, that is, as pi → 1, the modulating factor tends to 0, down-weighting the loss value for well-classified examples. In the case of a misclassified sample, pi is small, making the modulating factor approximately or very close to 1, so the loss behaves like the plain Cross-Entropy loss.
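To see the down-weighting in action, here is a minimal NumPy sketch of the binary Focal Loss, -alpha_t * (1 - p_t)**gamma * log(p_t); the function name and the example probabilities are assumptions for illustration.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary Focal Loss: Balanced Cross-Entropy scaled by the
    modulating factor (1 - p_t)**gamma."""
    p = np.clip(p, 1e-7, 1 - 1e-7)            # avoid log(0)
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    modulating = (1.0 - p_t) ** gamma         # -> 0 as p_t -> 1, ~1 when p_t is small
    return -alpha_t * modulating * np.log(p_t)

y = np.array([1, 1])
p = np.array([0.95, 0.30])                    # well-classified vs misclassified
for g in (0.0, 1.0, 2.0, 5.0):
    print(f"gamma={g}:", focal_loss(p, y, gamma=g))
```

Setting gamma = 0 recovers the Balanced Cross-Entropy above; as gamma grows, the loss on the well-classified example collapses far faster than the loss on the misclassified one, which is exactly the down-weighting shown in the figure.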
![categorical cross entropy categorical cross entropy](https://res.cloudinary.com/datasciencia/image/upload/v1599932892/optimizers/classifiers-performance-graph_ouodjk.jpg)
Graphs for log(x) in red and -log(x) in blue, Image Source : Author.
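For reference, a small matplotlib sketch (the axis labels are my own) that reproduces such a figure; it shows why -log(x) makes a good loss: it vanishes as the predicted probability of the true class approaches 1 and blows up as it approaches 0.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.01, 1.0, 200)
plt.plot(x, np.log(x), color="red", label="log(x)")
plt.plot(x, -np.log(x), color="blue", label="-log(x)")
plt.axhline(0, color="gray", linewidth=0.5)
plt.xlabel("x (predicted probability of the true class)")
plt.ylabel("value")
plt.legend()
plt.show()
```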