This works fine:
model = LSTM(num_features, output_length)
sample_input = torch.rand(128, 10, 7)
output = model(sample_input)
output.shape
But this gives an error:
model = LSTM(num_features, output_length).cuda()
sample_input = torch.rand(128, 10, 7).cuda()
output = model(sample_input)
output.shape
It returns: RuntimeError: Input and hidden tensors are not at the same device, found input tensor at cuda:0 and hidden tensor at cpu
I’m using the latest version of PyTorch. Is this intended behaviour? I’m clearly moving both the input tensor and the model to the GPU, but the hidden state tensor still ends up on the CPU. Why is this happening? I’m on a server with multiple GPUs, but I’ve already pinned the device:
os.environ["CUDA_VISIBLE_DEVICES"]="0"
torch.cuda.set_device(0)
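For context, here is a minimal sketch of what I suspect is going on. The original class definition isn’t shown above, so the layer names, hidden size, and structure here are my assumptions: if `forward` creates its initial hidden state with a bare `torch.zeros(...)`, that tensor defaults to the CPU even after `.cuda()` moves the registered parameters, because it isn’t a module parameter or buffer. Creating it on `x.device` would keep everything on the same device:

```python
import torch
import torch.nn as nn

# Hypothetical reconstruction of the LSTM class (names and sizes assumed).
class LSTM(nn.Module):
    def __init__(self, num_features, output_length, hidden_size=64, num_layers=1):
        super().__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(num_features, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_length)

    def forward(self, x):
        # torch.zeros(...) without a device argument lands on the CPU;
        # passing device=x.device follows the input onto the GPU.
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.lstm(x, (h0, c0))
        # Feed the last time step into the output layer.
        return self.fc(out[:, -1, :])

model = LSTM(num_features=7, output_length=2)
output = model(torch.rand(128, 10, 7))
print(output.shape)  # torch.Size([128, 2])
```

With this change, the same `model.cuda()` / `sample_input.cuda()` call sequence from above should run without the device-mismatch error.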