Hello,

I want to add to what @albanD said. Indeed, transferring the newly created random tensor to the GPU inside the class's `__init__()` method results in this warning. Moreover, if you then try to read the gradients attached to that random tensor, you will get `None`. The solution is to move your random tensor to the GPU only during the forward pass, NOT in the class definition itself.
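The root of the problem is that `.to()` is an operation tracked by autograd: applied to a leaf tensor that has `requires_grad=True`, it returns a *new, non-leaf* tensor, and `.grad` is only populated on leaf tensors. A minimal sketch of just that behaviour (variable names are my own):

```python
import torch

dev = torch.device("cuda:0")

t = torch.rand(5, 5, requires_grad=True)  # leaf tensor, created on the CPU
moved = t.to(dev)                         # .to() is recorded by autograd

print(t.is_leaf)       # True  -> t.grad will be populated by backward()
print(moved.is_leaf)   # False -> moved.grad stays None (hence the warning)
```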
To replicate the problem you can run this code (let us call it the *problematic code*):
```python
import torch
import torch.nn as nn

mdev = torch.device("cuda:0")
torch.manual_seed(123)

class mclass(torch.nn.Module):
    def __init__(self):
        super(mclass, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.a1 = nn.Sigmoid()
        self.fc2 = nn.Linear(5, 1)
        self.a2 = nn.Sigmoid()
        # Problem: .to() returns a new non-leaf tensor, so no gradients
        # will ever be populated on self.mw
        self.mw = torch.rand(5, 5, requires_grad=True).to(mdev)

    def forward(self, b):
        b = self.a1(self.fc1(b))
        b = b @ self.mw
        b = self.a2(self.fc2(b))
        return b

# ----- RUN -----
tmodel = mclass().to(mdev)
a = torch.round(torch.rand(4, 1)).to(mdev)
b = torch.rand(4, 10).to(mdev)
CE = torch.nn.BCELoss()

pred = tmodel(b)
loss = CE(pred, a)
print(loss)
loss.backward()
print('Gradients attached on my random tensor are:', tmodel.mw.grad)
```
You will get the above user warning together with `None` gradients.
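For reference, on my PyTorch version the warning reads roughly as follows (the exact wording may differ between releases):

```
UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being
accessed. Its .grad attribute won't be populated during autograd.backward().
```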
The solution is to change your class definition as follows:
```python
class mclass(torch.nn.Module):
    def __init__(self):
        super(mclass, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.a1 = nn.Sigmoid()
        self.fc2 = nn.Linear(5, 1)
        self.a2 = nn.Sigmoid()
        self.mw = torch.rand(5, 5, requires_grad=True)  # <----- stays on the CPU, remains a leaf

    def forward(self, b):
        b = self.a1(self.fc1(b))
        b = b @ self.mw.to(mdev)  # <----- moved to the GPU only here
        b = self.a2(self.fc2(b))
        return b
```
Now you can see the attached gradients with no warning.
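As a side note: if the random tensor is meant to be a trainable weight of the module, the more idiomatic approach, as far as I know, is to register it as an `nn.Parameter`. Then `mclass().to(mdev)` moves it together with the other weights, it stays a leaf tensor, and optimizers pick it up automatically (a sketch, reusing the same class as above):

```python
class mclass(torch.nn.Module):
    def __init__(self):
        super(mclass, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.a1 = nn.Sigmoid()
        self.fc2 = nn.Linear(5, 1)
        self.a2 = nn.Sigmoid()
        self.mw = nn.Parameter(torch.rand(5, 5))  # requires_grad=True by default

    def forward(self, b):
        b = self.a1(self.fc1(b))
        b = b @ self.mw  # already on the right device after mclass().to(mdev)
        b = self.a2(self.fc2(b))
        return b
```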
However, if you are using the CPU instead of the GPU, you will see the attached gradients and no user warning even with the *problematic code*. Change `mdev = torch.device("cuda:0")` to `mdev = torch.device("cpu")` in the code and run it; it will run normally. At first I did not understand why that happens.