I am training a language model. I want to ignore a certain class index (-1) in computing loss only at test time, but consider all class indices in the loss function during training.
So at test time, I map all instances of the to-be-ignored vocab items to that index (-1), and then pass ignore_index=-1 to torch.nn.CrossEntropyLoss. But that gives me this error:
/opt/conda/conda-bld/pytorch_1565287148058/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
Is this because I am training and testing my model with a different number of classes? How can I fix this? Thank you for your help!
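
For reference, here is a minimal sketch of the setup described above, not the actual model code. The names vocab_size, ignored_ids, logits, and targets are hypothetical placeholders, and the tensors are random. It uses one criterion for training (all indices count) and a second one with ignore_index=-1 for testing. The final check mirrors the condition in the assertion message: every target that is not equal to ignore_index has to lie in [0, n_classes).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and ids, chosen only for illustration.
vocab_size = 10
ignored_ids = torch.tensor([7, 8, 9])   # vocab items to ignore at test time

logits = torch.randn(6, vocab_size)     # shape: (batch, n_classes)
targets = torch.tensor([1, 7, 3, 9, 0, 8])

# Training: every class index contributes to the loss.
train_criterion = nn.CrossEntropyLoss()
train_loss = train_criterion(logits, targets)

# Test: remap the to-be-ignored vocab items to -1 on a copy of the targets.
test_targets = targets.clone()
ignore_mask = (test_targets.unsqueeze(1) == ignored_ids).any(dim=1)
test_targets[ignore_mask] = -1

# The same value must be passed as ignore_index to the test-time criterion.
test_criterion = nn.CrossEntropyLoss(ignore_index=-1)
test_loss = test_criterion(logits, test_targets)

# Sanity check: each target is either the ignore index or a valid class id.
n_classes = logits.size(1)
in_range = (test_targets >= 0) & (test_targets < n_classes)
assert bool(((test_targets == -1) | in_range).all())
```

Running the same range check on the real test targets, on CPU, should show whether any value other than -1 falls outside the number of output classes of the model.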