'ignore_index' for nn.CrossEntropyLoss() doesn't seem to work

MANSUM · October 18, 2017, 4:32pm

I am trying to use ignore_index, a newly introduced keyword parameter for nn.CrossEntropyLoss(). I built PyTorch from source, so I am on the latest (bleeding-edge) version.
The following is my code:

criterion = nn.CrossEntropyLoss()
for epoch in range(self.numEpoch):
    for batch in self.train_loader:
        user_idx = Variable(batch['user_idx']).cuda()
        item_vecs = Variable(batch['item_vecs'].float()).cuda()
        optimizer.zero_grad()
            
        pred = model(user_idx, item_vecs)
        loss = criterion(pred, item_vecs, ignore_index=-1) # I get the error here!!

        loss.backward()
        optimizer.step()

I get the following error message when I call criterion(pred, item_vecs, ignore_index=-1):

TypeError: "forward() got an unexpected keyword argument 'ignore_index'"

I traced through the source code and found that the forward() being called belongs to class CrossEntropyLoss in torch/nn/modules/loss.py.

The initializer of that class does take an ignore_index parameter (see line 515 of torch/nn/modules/loss.py).

Does anyone have any idea why this doesn’t work?

richard · October 18, 2017, 5:30pm

http://pytorch.org/docs/master/nn.html#torch.nn.CrossEntropyLoss

ignore_index should be specified when you construct the criterion, i.e.,
criterion = nn.CrossEntropyLoss(ignore_index = -1)

And then you can use it elsewhere:
criterion(output, target)
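
For anyone landing here later, the pattern above can be sketched end to end as follows. This is a minimal illustration, not the original poster's model: the tensor shapes and values are made up, and it assumes a recent PyTorch where plain tensors can be used without Variable wrappers. Note that the target holds integer class indices (dtype long), with -1 marking the entries to skip.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy batch: 4 samples, 3 classes. Shapes and values are made up for illustration.
logits = torch.randn(4, 3)
# Targets are class indices (dtype long); -1 marks entries to skip.
target = torch.tensor([0, 2, -1, 1])

# ignore_index is a constructor argument, not an argument to the call.
criterion = nn.CrossEntropyLoss(ignore_index=-1)
loss = criterion(logits, target)

# With the default mean reduction, the loss is averaged over the
# non-ignored targets only, so it should match the loss computed on
# just the kept rows.
keep = target != -1
reference = nn.CrossEntropyLoss()(logits[keep], target[keep])
print(loss.item(), reference.item())
```

The two printed values should agree, which shows that the row with target -1 contributes nothing to the loss.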

MANSUM · October 19, 2017, 5:01am

I see! Thanks a lot!

shahabty · February 17, 2018, 10:23pm

Hello,
Whenever I set ignore_index to -1 or 255, I get the following error:
line 1054, in nll_loss
return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1518238409320/work/torch/lib/THCUNN/generic/SpatialClassNLLCriterion.cu:131

However, I need to set ignore_index to 255 or -1.

11185 · June 27, 2018, 5:35pm

Have you solved this problem?

royboy · June 27, 2018, 11:26pm

Are you running into this issue? I can’t seem to reproduce it. Can you provide some code or the inputs that this fails on?
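
One way to gather the inputs asked for above is to check which label values the target actually contains. A possible cause, not confirmed in this thread, is that a device-side assert in the NLL criterion is triggered by target values that are neither in [0, num_classes - 1] nor equal to ignore_index (for example, targets that use 255 for void pixels while ignore_index is set to -1). The helper below is a hypothetical sketch for spotting such values; find_invalid_labels is not a PyTorch function, and the target tensor is made up.

```python
import torch

def find_invalid_labels(target, num_classes, ignore_index):
    """Return sorted label values that are neither valid classes nor ignore_index.

    Hypothetical helper for debugging; not part of PyTorch.
    """
    values = torch.unique(target)  # distinct label values, sorted ascending
    valid = ((values >= 0) & (values < num_classes)) | (values == ignore_index)
    return values[~valid].tolist()

# Made-up segmentation-style target: 2 classes, 255 marks "void" pixels.
target = torch.tensor([[0, 1, 255], [1, 0, 255]])

bad_with_minus_one = find_invalid_labels(target, num_classes=2, ignore_index=-1)
bad_with_255 = find_invalid_labels(target, num_classes=2, ignore_index=255)
print(bad_with_minus_one)
print(bad_with_255)
```

Running the check on the CPU copy of the target (target.cpu()) before calling the criterion avoids having to decode the CUDA assert. If the returned list is non-empty, those values would need to be remapped or matched by ignore_index.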