Now I've found what's going on: there was one unsqueeze_() left somewhere, which is in-place. For some reason unsqueeze_() doesn't affect torch.sum(), but it does affect nn.CrossEntropyLoss().
I didn't use any obvious in-place ops anywhere (except things like unsqueeze_(), which I tried to remove, but that had no effect).
To understand what's going on, I picked a few tensors at random during the calculation, and here is a ridiculous thing:
I chose a variable, called x, and my code has:
loss = nn.CrossEntropyLoss()(x, torch.LongTensor())
loss.backward() fails, complaining that one of the variables needed for gradient computation has been modified by an inplace operation.
I changed it to
loss = torch.sum(x)
loss.backward() works fine.
How is this even possible? If the grad of x somehow contained a hidden problem, shouldn't the second version break as well?
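One possible explanation, offered as a sketch rather than a confirmed diagnosis of the code above: autograd only complains about an in-place edit when the edited tensor was saved for the backward pass. A sum needs only the shape of its input to compute gradients, while ops such as exp (and, as far as I know, the log-softmax inside the cross-entropy loss) keep tensor values around. The toy example below uses exp and a multiplication instead of the real model; the helper names backward_ok, exp_then_inplace and mul_then_inplace are made up for illustration.

```python
import torch


def backward_ok(make_loss):
    """Return True if backward() succeeds, False if autograd raises RuntimeError."""
    x = torch.randn(3, requires_grad=True)
    try:
        make_loss(x).backward()
        return True
    except RuntimeError:
        return False


def exp_then_inplace(x):
    y = x.exp()   # exp saves its output y, because d/dx exp(x) = y
    y.add_(1)     # the in-place edit invalidates the saved y
    return y.sum()


def mul_then_inplace(x):
    y = x * 2     # backward only needs the constant 2, not y
    y.add_(1)     # same in-place edit, but nothing saved is touched
    return y.sum()


exp_fails = not backward_ok(exp_then_inplace)
mul_works = backward_ok(mul_then_inplace)
print("exp + in-place fails:", exp_fails)
print("mul + in-place works:", mul_works)
```

So the same in-place op can be harmless under one loss and fatal under another, depending on what each op in the graph saved for backward.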
So to resolve this, I manually wrote the nn.CrossEntropyLoss() computation myself. (It's not hard: per https://pytorch.org/docs/master/nn.html?highlight=nllloss#crossentropyloss it's just one line of code.) Now it's not complaining.
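For anyone reading later, a hand-written version could look roughly like the following. This is a minimal sketch and not necessarily the exact line used above: it assumes logits of shape (N, C), integer class targets of shape (N,), and mean reduction. The names manual_cross_entropy, logits and target are placeholders.

```python
import torch
import torch.nn.functional as F


def manual_cross_entropy(logits, target):
    # log-softmax over the class dimension: x - log(sum_j exp(x_j))
    log_probs = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # pick the log-probability of the correct class for each sample
    picked = log_probs[torch.arange(logits.size(0)), target]
    # negative log-likelihood, averaged over the batch
    return -picked.mean()


torch.manual_seed(0)
logits = torch.randn(4, 5, requires_grad=True)
target = torch.tensor([1, 0, 4, 2])

manual = manual_cross_entropy(logits, target)
reference = F.cross_entropy(logits, target)
manual.backward()  # gradients flow through the hand-written loss

print(torch.allclose(manual, reference))
```

Note that this only sidesteps the error. The leftover in-place op is still in the graph, so it is probably worth replacing it with the out-of-place unsqueeze() as well.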
Any suggestions / comments? Thanks quite a lot.