I noticed that the cross_entropy_loss function has an ignore_index keyword that defaults to -100. Isn’t this dangerous for people training a classifier with more than 100 classes (e.g. classifiers on ImageNet data)? Won’t one of their classes be silently ignored? I tried to trace this keyword through the source code to see where it is actually used, but I eventually reach a torch._C._nn module that I can’t find, so I can’t confirm whether this is a real issue (it seems like it would be hard to make it not one).
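One way to reason about this without reading torch._C._nn is to model the documented behaviour directly. The PyTorch docs describe ignore_index as a target value that is skipped, and class targets are indices in the range 0 to C-1. The sketch below is a pure-Python stand-in written under that assumption; the `cross_entropy` helper here is hypothetical and is not PyTorch's implementation. It suggests that the default of -100 is a negative sentinel, so it should not collide with class 100 or any other valid class.

```python
import math


def cross_entropy(logits, targets, ignore_index=-100):
    """Mean cross-entropy over samples whose target != ignore_index.

    Hypothetical reference helper that mimics the documented semantics;
    it is NOT the actual PyTorch implementation.
    """
    losses = []
    for row, t in zip(logits, targets):
        if t == ignore_index:
            # The target *value* matches the sentinel: skip this sample.
            continue
        # Numerically stable log-sum-exp over the class scores.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum_exp - row[t])
    # Average over the non-ignored samples only.
    return sum(losses) / len(losses) if losses else float("nan")


num_classes = 200
uniform = [0.0] * num_classes  # uniform scores, so each sample costs log(200)

# Target 100 is an ordinary class index and contributes to the loss.
loss_class_100 = cross_entropy([uniform], [100])

# Only a target that is literally -100 is dropped from the average.
loss_mixed = cross_entropy([uniform, uniform], [100, -100])

print(loss_class_100, loss_mixed)
```

Under this model, a class would only be ignored if ignore_index were explicitly set to a non-negative value that is also a valid class index. The same experiment can be repeated against the real function to confirm that it behaves this way.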