Hi,
The problem is that `_, pred = torch.max(preds.view(1, -1).type(torch.cuda.FloatTensor), 1)` is not a differentiable operation: the indices returned by `torch.max` are discrete integers, so no gradients can flow back from `pred` to `preds`.
In general, if you have to set the `requires_grad=True` flag by hand on an intermediate value, it means that some earlier operation was not differentiable, and so you won't get the gradients you want!
You can look around (Google, or other posts on this forum) for differentiable functions to replace the hard `.max_indices()`, but they are all quite heuristic.
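One such heuristic (not the only one, and the temperature value below is an arbitrary assumption) is a "soft-argmax": take a softmax over the scores and compute the expected index, which is a float you can backpropagate through:

```python
import torch

def soft_argmax(logits, tau=0.1):
    # Differentiable surrogate for the hard indices of torch.max.
    # Lower tau sharpens the softmax toward a hard argmax.
    weights = torch.softmax(logits / tau, dim=-1)
    idx = torch.arange(logits.size(-1), dtype=logits.dtype, device=logits.device)
    # Expected index under the softmax weights (a float, not an int)
    return (weights * idx).sum(dim=-1)

preds = torch.randn(1, 10, requires_grad=True)
pred = soft_argmax(preds.view(1, -1))
pred.sum().backward()  # gradients now flow back to preds
```

Note that `pred` here approximates the argmax rather than computing it exactly, which is the trade-off all these heuristics make.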