Training on GPU fails

I know I should move both the model and the data to the GPU to train a model. I did, and the model looks like this:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, stride=2)
        self.pool = nn.MaxPool2d(2, 2)
        self.bn1 = nn.BatchNorm2d(16)
        self.conv2 = nn.Conv2d(16, 32, 3, stride=2)
        self.bn2 = nn.BatchNorm2d(32)
        self.conv3 = nn.Conv2d(32, 32, 3, stride=2)
        self.bn3 = nn.BatchNorm2d(32)
        self.head = nn.Linear(148, 100)
        self.fc = nn.Linear(100, 50)
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.pool(torchF.relu(self.bn1(self.conv1(x))))
        x = self.pool(torchF.relu(self.bn2(self.conv2(x))))
        x = self.pool(torchF.relu(self.bn3(self.conv3(x))))
        x = x.view(x.size(0), -1)
        x_size1 = x.size()[1]
        self.head = nn.Linear(x_size1, 100)
        self.head.double()
        x = self.fc2(self.fc(self.head(x)))
        return x

Then I define the input and run the model as:
x = Variable(torch.from_numpy(np.random.randn(1,3,128,128)).cuda())
net = Net()
net.double()
out = net(x)

which raises the following error:
TypeError: addmm_ received an invalid combination of arguments - got (int, int, torch.cuda.DoubleTensor, torch.DoubleTensor), but expected one of:

  • (torch.cuda.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (torch.cuda.sparse.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (float beta, torch.cuda.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (float alpha, torch.cuda.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (float beta, torch.cuda.sparse.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (float alpha, torch.cuda.sparse.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
  • (float beta, float alpha, torch.cuda.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
    didn’t match because some of the arguments have invalid types: (int, int, torch.cuda.DoubleTensor, torch.DoubleTensor)
  • (float beta, float alpha, torch.cuda.sparse.DoubleTensor mat1, torch.cuda.DoubleTensor mat2)
    didn’t match because some of the arguments have invalid types: (int, int, torch.cuda.DoubleTensor, torch.DoubleTensor)

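One of the tensors passed to Linear is on the CPU, either the input or the weight. The argument list in the error shows the mix: one torch.cuda.DoubleTensor and one CPU torch.DoubleTensor.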
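A quick way to check which side is on the CPU is to compare the devices of the layer's parameters with the device of the input. The snippet below is only a minimal sketch: `report_devices` is a hypothetical helper (not part of PyTorch), and it assumes a recent PyTorch version in which tensors expose a `.device` attribute.

```python
import torch
import torch.nn as nn


def report_devices(module, x):
    """Hypothetical helper: device types of the module's parameters and of the input."""
    param_devices = {p.device.type for p in module.parameters()}
    return param_devices, x.device.type


layer = nn.Linear(4, 2).double()            # parameters are created on the CPU
x = torch.randn(1, 4, dtype=torch.float64)  # input is created on the CPU too

if torch.cuda.is_available():
    # Moving only the input reproduces the kind of mismatch in the question.
    x = x.cuda()

param_devices, input_device = report_devices(layer, x)
print("parameter devices:", param_devices, "| input device:", input_device)
```

If the two disagree, the Linear call will fail until both the module and the input are moved to the same device.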

I see the bug: the model is never moved to the GPU.
You should do:

net = Net()
net.cuda().double()
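One thing to watch: the forward() in the question builds self.head on every call, and a layer created there starts on the CPU regardless of an earlier net.cuda(). A possible way around this, sketched below under assumptions, is to create every layer in `__init__` so that a single device move covers all of them. The assumptions are mine and not from the original post: inputs are always 3x128x128 (which flatten to 32 features after the three conv/pool stages), `torch.nn.functional` is imported as `F`, and the PyTorch version is recent enough to have `torch.device` and `.to()`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, stride=2)
        self.pool = nn.MaxPool2d(2, 2)
        self.bn1 = nn.BatchNorm2d(16)
        self.conv2 = nn.Conv2d(16, 32, 3, stride=2)
        self.bn2 = nn.BatchNorm2d(32)
        self.conv3 = nn.Conv2d(32, 32, 3, stride=2)
        self.bn3 = nn.BatchNorm2d(32)
        # Assumption: 128x128 inputs, which flatten to 32 features at this point.
        self.head = nn.Linear(32, 100)
        self.fc = nn.Linear(100, 50)
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = x.view(x.size(0), -1)
        # No layers are created here, so nothing is left behind on the CPU.
        return self.fc2(self.fc(self.head(x)))


# Fall back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = Net().double().to(device)  # moves all registered parameters and buffers
x = torch.randn(1, 3, 128, 128, dtype=torch.float64, device=device)
out = net(x)
print(out.shape)
```

With a different input resolution the 32 in `self.head` would need to change; computing it once from a dummy forward pass before building the head is another option.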

Thank you! It works!