I want to get the input gradient from `mse_loss` with `reduction='mean'`. Normally, the process is:

```
import torch
x = torch.tensor([1., 2., 3., 4., 5.])
x.requires_grad_(True)
target = torch.tensor([5., 4., 3., 2., 1.])
out = torch.nn.functional.mse_loss(x, target, reduction='mean')
out.backward(torch.tensor(2.))
print(x.grad)
```

The output is `tensor([-3.2000, -1.6000, 0.0000, 1.6000, 3.2000])`. However, if I execute `out.backward(torch.tensor([2., 3., 4., 5., 6.]))` instead, I get a RuntimeError.
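For reference, here is how I understand the printed gradient (my own derivation, not from the docs): with `reduction='mean'`, the analytic gradient is `2 * (x - target) / N`, and `backward` scales it by the scalar gradient passed in:

```
import torch

x = torch.tensor([1., 2., 3., 4., 5.])
target = torch.tensor([5., 4., 3., 2., 1.])
n = x.numel()

# analytic gradient of mean-reduced MSE, scaled by the scalar grad 2.0
# (assumption: this is the formula backward() applies)
manual = 2.0 * (x - target) / n * 2.0
print(manual)  # tensor([-3.2000, -1.6000,  0.0000,  1.6000,  3.2000])
```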

If I use `mse_loss_backward` to get `x.grad` instead, the process is:

```
import torch
grad_output = torch.tensor([2., 3., 4., 5., 6.])
x = torch.tensor([1., 2., 3., 4., 5.])
target = torch.tensor([5., 4., 3., 2., 1.])
out = torch.ops.aten.mse_loss_backward(grad_output, x, target, 1)  # 1 = 'mean'
print(out)
```

The output is `tensor([-3.2000, -2.4000, 0.0000, 4.0000, 9.6000])`.
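From the numbers above, my assumption is that `mse_loss_backward` applies the same analytic formula `2 * (x - target) / N`, but multiplies it elementwise by `grad_output`, which would explain why a vector `grad_output` is accepted here:

```
import torch

x = torch.tensor([1., 2., 3., 4., 5.])
target = torch.tensor([5., 4., 3., 2., 1.])
grad_output = torch.tensor([2., 3., 4., 5., 6.])

# assumed formula: elementwise scaling of the mean-reduced MSE gradient
manual = 2.0 * (x - target) / x.numel() * grad_output
aten = torch.ops.aten.mse_loss_backward(grad_output, x, target, 1)  # 1 = 'mean'
print(manual)
print(torch.allclose(manual, aten))  # True
```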

I want to know what the difference between the two methods is, and why they can't take the same `grad_output`.