Gradients not flowing

I see, okay. One last question: does having an imbalanced problem affect the gradients that are computed? I'm using CrossEntropyLoss() with only 10-20 real labels per sample; the rest of the targets are set to -100, the ignore index.
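To make that setup concrete, here's a minimal sketch (shapes and numbers are made up just for illustration): positions whose target is -100 are skipped by CrossEntropyLoss, so they contribute nothing to the loss or the gradient, and the mean is taken over the labelled positions only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 5
seq_len = 100

# Fake logits standing in for the model output (requires_grad so we can inspect gradients).
logits = torch.randn(seq_len, num_classes, requires_grad=True)

# Only ~10 positions carry real labels; everything else is the ignore index.
targets = torch.full((seq_len,), -100, dtype=torch.long)
labelled = torch.randperm(seq_len)[:10]
targets[labelled] = torch.randint(0, num_classes, (10,))

criterion = nn.CrossEntropyLoss(ignore_index=-100)  # -100 is also the default
loss = criterion(logits, targets)
loss.backward()

# Gradients are exactly zero at every ignored position; the loss is averaged
# over the 10 labelled positions only.
print(logits.grad[targets == -100].abs().max())  # tensor(0.)
```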

It will change the shape of the loss function, for sure, but I don't think it should be a major issue.

I see, okay. My samples are heavily skewed, with only about 10% carrying real labels; the rest are the ignore index. I'll have a look at whether the net can learn better with other loss functions. Thanks a lot for explaining!
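One thing worth trying before swapping the loss entirely is per-class weighting in CrossEntropyLoss itself. A small sketch below; the class counts are hypothetical, purely to show the inverse-frequency idea.

```python
import torch
import torch.nn as nn

num_classes = 5
# Hypothetical label counts over the training set (rarer classes get larger weights).
class_counts = torch.tensor([500.0, 120.0, 80.0, 40.0, 10.0])

# Inverse-frequency weights, normalised so they average to 1.
weights = class_counts.sum() / (num_classes * class_counts)

# Ignored positions (-100) are still skipped; weighting only applies to real labels.
criterion = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)
```

Whether this helps depends on whether the imbalance is between classes or just between labelled and ignored positions; the ignored positions already contribute nothing, so weighting only changes how the real labels trade off against each other.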