Input type and weight type mismatch when moving a GPU model to CPU

arvindmohan · March 1, 2019, 1:00am

Please consider the code example below. Even though the weights of a GPU-trained model have been moved to the CPU, the input tensor still appears to be stuck on the GPU. I have tried everything I can think of, and I still cannot pass CPU tensors as input. My GPU RAM is insufficient for inference, so I really have to run this on the CPU. Any idea why the model input is not being moved to the CPU while everything else is? It seems a bit misleading. Thanks,

model = torch.load(path)
model = model.cpu()

for param in model.parameters():
  print(param.data.type())
# Output:
torch.FloatTensor
torch.FloatTensor
torch.FloatTensor
torch.FloatTensor

y = model(data.type(torch.FloatTensor))
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

MariosOreo · March 1, 2019, 1:40am

Hi,

You could try data = data.to('cpu') to move the data from the GPU to the CPU.
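One detail that is easy to miss with this suggestion: for tensors, .to() returns a new tensor and leaves the original untouched, so the result has to be assigned. A minimal sketch, assuming a hypothetical float64 tensor named data (the starting device is the GPU only if one is available):

```python
import torch

# Hypothetical input; it starts on the GPU only if one is present.
start_device = "cuda" if torch.cuda.is_available() else "cpu"
data = torch.randn(2, 10, dtype=torch.float64).to(start_device)

# This call has no lasting effect: the returned tensor is discarded.
data.to("cpu", dtype=torch.float32)

# Assigning the result is what actually gives you a CPU float32 tensor.
moved = data.to("cpu", dtype=torch.float32)

print(data.dtype, moved.dtype, moved.device)
```

Calling .to() or .cpu() on an nn.Module is different: it moves the module's parameters in place, which is why model.cpu() works without reassignment.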

arvindmohan · March 1, 2019, 1:58am

Nope, same error unfortunately.

MariosOreo · March 1, 2019, 2:11am

I tested your case with the following example:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10,5)
        self.fc2 = nn.Linear(5,5)
        self.fc3 = nn.Linear(5,1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        return x

random_input = torch.randn(10, requires_grad=True)
random_target = torch.randn(1, requires_grad=True)


ckpt = torch.load('model.pth')
net = Net()
net.load_state_dict(ckpt)

for param in net.parameters():
    print(param.type())
# Output: torch.FloatTensor

input = random_input.to('cuda:0')
print(input.type())
# Output: torch.cuda.FloatTensor

output = net(input.type(torch.FloatTensor))

But it works on my side.
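Since GPU memory is the constraint here, it may also help to pass map_location to torch.load so that tensors saved from a GPU are placed on the CPU as they are loaded. A minimal sketch, using an in-memory buffer as a stand-in for the checkpoint file and a hypothetical single-layer model:

```python
import io

import torch
import torch.nn as nn

# Stand-in for a checkpoint file: save a small state_dict to a buffer.
model = nn.Linear(10, 1)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# map_location="cpu" remaps every stored tensor to the CPU while loading.
state = torch.load(buffer, map_location="cpu")

cpu_model = nn.Linear(10, 1)
cpu_model.load_state_dict(state)

for name, tensor in state.items():
    print(name, tensor.device)
```

With a real file, the call would look like torch.load('model.pth', map_location='cpu').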

arvindmohan · March 1, 2019, 2:57am

Everything is fine, including loading the model…except the evaluation. I am still getting the same error.

for param in model.parameters():
  print(param.data.type())

#OUTPUT: torch.FloatTensor
torch.FloatTensor
torch.FloatTensor
torch.FloatTensor

My evaluation is

x = data[:2,::]
x.to('cpu')
#output: 'torch.DoubleTensor'
y = model(x.type(torch.FloatTensor))

So I really have no clue where it is going wrong… I think the model still expects the input tensors to be on the GPU, which is confusing since the model has been explicitly moved to the CPU. I am using PyTorch 1.0.1.
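One way to narrow down where an input ends up on the wrong device is to log the device of each submodule's input with forward pre-hooks. The sketch below is only an illustration: record_input_devices is a hypothetical helper, and the nn.Sequential model stands in for the real network.

```python
import torch
import torch.nn as nn

def record_input_devices(model):
    """Attach pre-hooks that record the device of each submodule's first input."""
    seen = []
    handles = []
    for name, module in model.named_modules():
        if name == "":
            continue  # skip the root module itself

        def hook(mod, args, name=name):
            if args and isinstance(args[0], torch.Tensor):
                seen.append((name, args[0].device.type))

        handles.append(module.register_forward_pre_hook(hook))
    return seen, handles

# Stand-in model; replace with the network under investigation.
model = nn.Sequential(nn.Linear(10, 5), nn.Linear(5, 1))
seen, handles = record_input_devices(model)

model(torch.randn(2, 10))

for handle in handles:
    handle.remove()  # detach the hooks once done

print(seen)
```

If the forward pass fails partway through, the last recorded entry points at the layer that received the unexpected input.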

EDIT: I’ve solved the issue…turns out my custom NN was explicitly calling .cuda() for the input tensor. Thanks for all the help @MariosOreo! Definitely gave me some clarity.
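For anyone hitting the same error: a hard-coded .cuda() inside forward() moves the input back to the GPU no matter where the model lives. One possible way to avoid this, sketched with a hypothetical DeviceAgnosticNet (not the original poster's network), is to follow the device of the module's own parameters:

```python
import torch
import torch.nn as nn

class DeviceAgnosticNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        # Instead of x = x.cuda(), move the input to wherever the weights are.
        device = next(self.parameters()).device
        return self.fc(x.to(device))

net = DeviceAgnosticNet().cpu()
x = torch.randn(2, 10)
y = net(x)

print(y.device, tuple(y.shape))
```

The same module then runs unchanged after net.cuda() on a machine with a GPU, since the input follows the parameters.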