Slicing a tensor affects normal autograd behavior

First, I placed the slicing operation outside the “with torch.no_grad()” scope. The classifier's loss does not decrease and its predictions are wrong.


Then I moved the slicing operation inside “with torch.no_grad()”, and the classifier behaves normally.

Why does the classifier behave abnormally in the first case?

It might matter if your data requires grad, but otherwise it shouldn’t. Do you have a small repro demonstrating the issue?