Once we send the model to CUDA, should any input data to the model be on CUDA as well?
What I found was that an input tensor did not need to be on CUDA for a model which had previously been sent to CUDA.
Is there any golden rule for this?
Yes, the input data also has to be on the GPU, and PyTorch will fail otherwise:
import torch
import torch.nn as nn

model = nn.Linear(10, 10).cuda()
x = torch.randn(1, 10)
out = model(x)
# RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)

x = x.cuda()
out = model(x)
print(out.shape)
# torch.Size([1, 10])
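The usual "golden rule" is to make the code device-agnostic: pick the device once and move both the model and every input tensor to it. A minimal sketch (it falls back to the CPU when no GPU is available):

```python
import torch
import torch.nn as nn

# Pick the device once; reuse it for the model and all inputs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 10).to(device)
x = torch.randn(1, 10).to(device)  # input moved to the same device
out = model(x)
print(out.shape)
# torch.Size([1, 10])
```

This way the same script runs on both CPU-only and GPU machines without edits.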
I don’t know why it was not failing in your use case, so could you describe in a bit more detail what exactly you were running?
So I defined a predict function in my nn.Module subclass which returns a scalar class prediction:
def predict(self, inputs):
    # convert the Python list to a LongTensor (this stays on the CPU)
    inputs = torch.LongTensor(inputs)
    # forward pass, then return the predicted class index as a numpy scalar
    return torch.argmax(self(inputs)).detach().numpy()
I called it in the training loop after sending my initialized model to CUDA. I thought self() would require its input to be on CUDA, but in this case I did not send the LongTensor to CUDA, and it still worked without even a warning.
Could you post a minimal, executable code snippet reproducing this behavior, please?
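In the meantime, you could print the devices that are actually involved; a mismatch between the parameters and the input is what normally triggers the error above. A quick check (shown here on the CPU with a stand-in nn.Linear, since I don't have your model):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
x = torch.LongTensor([1, 2, 3])

# The model's parameters and the input should report the same device.
print(next(model.parameters()).device)  # cpu
print(x.device)                         # cpu
```

If the first print shows cuda:0 while the second shows cpu, the forward pass would be expected to raise the device-mismatch error.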