Isn't setting default ignore_index=-100 a bit dangerous?


I noticed that the cross_entropy_loss function has an ignore_index keyword that, by default, is set to -100. Isn’t this kind of dangerous if people are trying to train a classifier that has more than 100 classes (e.g. classifiers on the ImageNet data)? They will have one of their classes silently ignored, won’t they? I tried to trace this keyword through the source code to see where it’s actually used, but I eventually get to a torch._C._nn module that I’m not able to find, so I can’t confirm whether this is a real issue or not (it seems like it would be hard to make this not a potential issue).

(Sebastian J Mielke) #2

I too wonder about this. Looking at the THNN code (no idea what happens in other backends), it seems like it literally compares the two signed integers (the current class and ignore_index) in C, so a negative ignore_index should simply never match a valid, non-negative class index.
Of course, I could be wrong, so an official answer or a note in the docs about negative numbers not being interpreted as in Python but as in C might be useful :slight_smile:
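
One way to check this empirically is a small experiment like the one below. It is only a sketch: the class count of 200 and the uniform logits are made up for illustration, and it uses torch.nn.functional.cross_entropy with reduction="none" so that each sample's loss is visible. If -100 were interpreted Python-style as "100 from the end", or otherwise collided with class 100, the first sample would come out as zero.

```python
import torch
import torch.nn.functional as F

num_classes = 200  # hypothetical class count, chosen to be larger than 100
logits = torch.zeros(3, num_classes)  # uniform scores for three samples

# Sample 0 has the real class 100, sample 1 the real class 199,
# sample 2 carries the -100 sentinel.
targets = torch.tensor([100, 199, -100])

# reduction="none" returns one loss per sample instead of the mean.
losses = F.cross_entropy(logits, targets, ignore_index=-100, reduction="none")

class_100_is_counted = losses[0].item() > 0   # class 100 still contributes
sentinel_is_ignored = losses[2].item() == 0.0  # the -100 target is skipped

print(losses)
print(class_100_is_counted, sentinel_is_ignored)
```

If both flags print True, the default only masks targets that are literally equal to -100, which matches the plain integer comparison described above.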