I have realized that PyTorch's negative log likelihood loss (`F.nll_loss`) has a `reduction` argument that could be used instead of my method below, but I still cannot figure out why my version doesn't work when it looks like it should.
In this case, `y` was a vector containing the integer class labels. I verified that the values were very close to what I would get from `torch.nn.functional.nll_loss`, but the model failed to learn, so there must be some problem that doesn't raise an error, and I don't know what it is. Can anyone see why this didn't work?
```python
loss = -torch.log(F.softmax(logits, dim=1)[:, y])
loss.mean().backward()
optimizer.step()
```
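One thing worth checking is what `[:, y]` actually selects when `y` is a vector. NumPy's fancy indexing behaves the same way as PyTorch's here, so a small NumPy sketch (the array values are made up, not from the original code) shows the shapes involved:

```python
import numpy as np

# Hypothetical batch: 4 samples, 3 classes (made-up probabilities).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])
y = np.array([0, 1, 2, 1])  # integer class labels, one per sample

# probs[:, y] takes the columns listed in y for EVERY row,
# producing a (4, 4) matrix rather than one value per sample.
all_pairs = probs[:, y]
print(all_pairs.shape)  # (4, 4)

# Selecting one probability per sample needs a row index too:
per_sample = probs[np.arange(len(y)), y]
print(per_sample.shape)  # (4,)
```

The `per_sample` form picks `probs[i, y[i]]` for each row `i`, which is what a per-example cross-entropy term needs; the `[:, y]` form averages over every (row, label) pair instead.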