Focal loss performs worse than cross-entropy loss in classification

Take a look at the original paper for more insight.
For focal loss:

"In practice α may be set by inverse class frequency or treated as a hyperparameter to set by cross validation"

Also, your problem is not highly imbalanced. You can use weighted cross entropy and get good performance.
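
In case it helps, here is a rough sketch in plain Python of both options: class weights set by inverse class frequency, used once as the weights in weighted cross entropy and once as α in focal loss. The function names, the toy data, the normalization of the weights to sum to 1, and averaging the focal loss over the batch are my own assumptions for illustration, not something taken from the paper or from your code. γ = 2 is the value the paper reports working best.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class count.

    Normalizing them to sum to 1 is an assumption; for the weighted
    mean below only the ratios between classes matter.
    """
    counts = Counter(labels)
    inv = {c: 1.0 / n for c, n in counts.items()}
    total = sum(inv.values())
    return {c: w / total for c, w in inv.items()}


def weighted_cross_entropy(probs, targets, weights):
    """Weighted mean of -log(p[y]), each sample weighted by weights[y]."""
    num = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, targets))
    den = sum(weights[y] for y in targets)
    return num / den


def focal_loss(probs, targets, alpha, gamma=2.0):
    """Batch mean of -alpha[y] * (1 - p[y])**gamma * log(p[y]).

    With gamma = 0 and alpha = 1 this reduces to plain cross entropy.
    """
    losses = [
        -alpha[y] * (1.0 - p[y]) ** gamma * math.log(p[y])
        for p, y in zip(probs, targets)
    ]
    return sum(losses) / len(losses)


# Toy data (made up): 90 samples of class 0, 10 samples of class 1.
labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)  # class 1 gets 9x the weight of class 0

# Predicted class probabilities for three samples and their true labels.
probs = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6]]
targets = [0, 0, 1]

print("weighted CE:", weighted_cross_entropy(probs, targets, weights))
print("focal loss: ", focal_loss(probs, targets, alpha=weights))
```

If you are on PyTorch, `torch.nn.CrossEntropyLoss` takes a per-class `weight` tensor, so the weighted cross entropy part does not need a custom loss.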
