tensor.grad=None for tensors on GPU

I found an issue with gradients for tensors on the GPU.

import torch
device = torch.device("cuda")
x = torch.rand(3, requires_grad=True)
x = x.to(device)
m = x.mean()
m.backward()
print(x)
print(m)
print(x.grad)

This code outputs

tensor([0.6155, 0.2922, 0.6875], device='cuda:0', grad_fn=<CopyBackwards>)
tensor(0.5317, device='cuda:0', grad_fn=<MeanBackward1>)
None

x.grad is None. I wonder why?

Hi,

When you do x = x.to(device), you rebind the name x to a new tensor: .to() returns a copy, it does not move the original tensor in place.
Only leaf tensors created with requires_grad=True accumulate gradients in .grad.
The x that you check is an intermediate tensor (the result of the copy), so its .grad stays None.
Note that you can do: x = torch.rand(3, device=device, requires_grad=True).
