Data and model both sent to CUDA

Once we send the model to CUDA, should any input data to the model be on CUDA as well?
What I found was that the input tensor did not need to be on CUDA for a model that had already been sent to CUDA.
Is there a golden rule for this?

Yes, the input data also has to be on the GPU, and PyTorch will fail otherwise:

import torch
import torch.nn as nn

model = nn.Linear(10, 10).cuda()
x = torch.randn(1, 10)  # this input tensor is still on the CPU

out = model(x)
# RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)

x = x.cuda()
out = model(x)
print(out.shape)
# torch.Size([1, 10])
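
As a general rule, the usual convention is to define the device once and move both the model and every input to it. Here is a minimal sketch of that pattern:

import torch
import torch.nn as nn

# Pick the device once and reuse it for the model and all inputs
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 10).to(device)
x = torch.randn(1, 10).to(device)  # input lands on the same device as the model

out = model(x)
print(out.shape)
# torch.Size([1, 10])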

I don’t know why it wasn’t failing in your use case, so could you describe in a bit more detail what exactly you were running?

So I defined a predict function in my nn.Module subclass which returns a scalar class prediction:

def predict(self, inputs):
    inputs = torch.LongTensor(inputs)  # creates a tensor on the CPU
    return torch.argmax(self(inputs)).detach().numpy()

I called it in the training loop after I had sent my initialized model to CUDA. I thought self() would require its input to be on CUDA, but in this case I did not send the LongTensor input to CUDA, and it worked without even a warning.
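
As an aside, one way to make such a predict method device-agnostic is to read the device off the model's own parameters and move the result back to the CPU before calling .numpy(). This is only a sketch with a hypothetical Classifier module, assuming the model has at least one registered parameter:

import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, num_embeddings=100, num_classes=5):
        super().__init__()
        self.emb = nn.Embedding(num_embeddings, 8)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        # average the embedded tokens, then classify
        return self.fc(self.emb(x).mean(dim=0))

    def predict(self, indices):
        device = next(self.parameters()).device  # wherever the model currently lives
        x = torch.LongTensor(indices).to(device)  # move the input to the same device
        with torch.no_grad():
            return torch.argmax(self(x)).cpu().numpy()

model = Classifier()
if torch.cuda.is_available():
    model = model.cuda()
print(model.predict([1, 2, 3]))  # prints the predicted class index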

Could you post a minimal, executable code snippet reproducing this behavior, please?