Why does `requires_grad_` change the value of `F.mse_loss`?

Consider this code:

>>> import torch
>>> from torch.nn import functional as F
>>> x = torch.ones(3)
>>> xr = 2 * x
>>> F.mse_loss(xr, x)
tensor(1.)

That is fine. But if I call `requires_grad_` on `x` first, I get a different answer:

>>> x = torch.ones(3).requires_grad_(True)
>>> xr = 2 * x
>>> F.mse_loss(xr, x)
tensor(3., grad_fn=<SumBackward0>)

This is not what I would expect. Shouldn’t `requires_grad_` only affect the gradients computed by `backward()`, i.e. automatic differentiation, and not the value of the forward pass?

What version are you using? It does not happen for me on version 1.0.0.

Interesting. I am using 0.4.1, but I can confirm that this behavior does not happen in 1.0.0.
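If upgrading is not an option, one possible workaround is to compute the mean squared error manually, which sidesteps whatever code path `F.mse_loss` takes when its arguments require grad. This is a sketch, not an official fix; the manual formula simply mirrors what the `mean` reduction is documented to do:

```python
import torch

x = torch.ones(3).requires_grad_(True)
xr = 2 * x

# Manual MSE: mean of squared differences. The result should be 1.0
# here regardless of the requires_grad flags on the inputs.
loss = ((xr - x) ** 2).mean()
```

The resulting `loss` is still differentiable, so `loss.backward()` works as usual.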