Why does `requires_grad_` change the value of `F.mse_loss`?

Consider this code:

>>> import torch
>>> from torch.nn import functional as F
>>> x = torch.ones(3)
>>> xr = 2 * x
>>> F.mse_loss(xr, x)
tensor(1.)
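
As a sanity check, this matches the mean of the squared differences computed by hand (assuming the default mean reduction):

>>> ((xr - x) ** 2).mean()
tensor(1.)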

So far so good. But if I call `requires_grad_`, I get a different answer:

>>> x = torch.ones(3).requires_grad_(True)
>>> xr = 2 * x
>>> F.mse_loss(xr, x)
tensor(3., grad_fn=<SumBackward0>)

This is not what I would expect. Shouldn't calling `requires_grad_` only matter for the backward pass, i.e. for automatic differentiation, rather than change the forward value of the loss?
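
For what it's worth, the 3. is exactly the sum of the squared differences over the 3 elements rather than their mean, which would also be consistent with the grad_fn=<SumBackward0> (just my guess at what is happening):

>>> ((xr - x) ** 2).sum()
tensor(3., grad_fn=<SumBackward0>)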

What version are you using? It does not happen for me on version 1.0.0.

Interesting. I am using 0.4.1, but I can also confirm that this behavior does not happen in 1.0.0.
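
If upgrading is not an option, a possible workaround (a sketch I have not tested on 0.4.1, so treat it as an assumption) is to detach the target so that it no longer requires grad; the target usually should not receive gradients from the loss anyway:

>>> x = torch.ones(3).requires_grad_(True)
>>> xr = 2 * x
>>> F.mse_loss(xr, x.detach())  # expected: tensor(1., grad_fn=<...>), matching the first example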