A very strange thing: my tensor doesn't get a grad after tensor.cuda()

Code first:

import torch
x = torch.ones((2,10), requires_grad=True)
z = torch.ones((1,10), requires_grad=True)
y = (4 * x * x + 10 * z).mean()
y.backward()

As the output shows, if we write the code above without tensor.cuda(), tensors x and z both have a grad here.

But if we move tensors x and z to CUDA with tensor.cuda(), like below:

import torch
x = torch.ones((2,10), requires_grad=True).cuda()
z = torch.ones((1,10), requires_grad=True).cuda()
y = (4 * x * x + 10 * z).mean()
y.backward()

If we write it like this, the grad will be None. What a strange thing! What happened here? I'm not new to PyTorch; I have learned and used it for a long time, but it's the first time I have met something like this.
Could anyone help me with it? Thanks for your reply.

Sorry guys, I'm very stupid. I read the basic concepts again and found that if I use x = x.cuda(), then x is no longer a leaf tensor, so backward() will not keep a gradient for x. I should add x.retain_grad() after x = x.cuda(), and then x will get a grad here.
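For anyone finding this later, here is a minimal sketch of that fix. It uses a CPU fallback so it runs without a GPU too; on CPU the tensors stay leaves and retain_grad() is simply harmless.

```python
import torch

# After .cuda()/.to(device), x is the *result* of an op, hence a non-leaf;
# retain_grad() asks autograd to keep its .grad anyway.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.ones((2, 10), requires_grad=True).to(device)
z = torch.ones((1, 10), requires_grad=True).to(device)
x.retain_grad()  # keep grad even though x may be a non-leaf now
z.retain_grad()

y = (4 * x * x + 10 * z).mean()
y.backward()

print(x.grad is not None)  # True
print(z.grad is not None)  # True
```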

You can avoid most of these problems by wrapping the tensors in an object that subclasses nn.Module. There's a reason most neural networks subclass nn.Module: calling .cuda() on a module moves its parameters in place, so they remain leaf tensors. Try this version:

import torch
from torch import nn

class foo(nn.Module):
    def __init__(self):
        super(foo, self).__init__()
        self.x = nn.Parameter(torch.ones(2, 10))  # requires_grad automatically True
        self.z = nn.Parameter(torch.ones(1, 10))

bar = foo().cuda()
y = (4 * bar.x * bar.x + 10 * bar.z).mean()
y.backward()
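Another option, if you don't want a module at all, is to create the tensors directly on the target device. A tensor built with device=... and requires_grad=True is itself a leaf there, so backward() populates .grad without any retain_grad(). A minimal sketch (with a CPU fallback so it runs anywhere):

```python
import torch

# Creating the tensor on the device (instead of moving it there afterwards)
# keeps it a leaf, so .grad is filled in by backward().
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.ones((2, 10), device=device, requires_grad=True)
z = torch.ones((1, 10), device=device, requires_grad=True)

y = (4 * x * x + 10 * z).mean()
y.backward()

print(x.is_leaf)           # True
print(x.grad is not None)  # True
```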