PyTorch Lightning RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same


I am trying to implement a model using PyTorch Lightning. During the sanity check validation loop (and also in the training loop if I disable the sanity check) I get the following error:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Full Traceback:

Traceback (most recent call last):
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 28, in <module>, datamodule=dm)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 532, in fit
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 980, in _run
    results = self._run_stage()
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 1021, in _run_stage
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 1050, in _run_sanity_check
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 181, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 115, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 376, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 294, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/strategies/", line 393, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 159, in validation_step
    predictions = self.forward(samples)  # [N, 2, D, H, W]
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 105, in forward
    x1 = self.down1(x)
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 44, in __call__
    x = self.relu(self.norm1(self.conv1(x)))
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 613, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 608, in _conv_forward
    return F.conv3d(
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

This suggests that my input tensor was moved to the GPU while my model is still on the CPU. The PyTorch Lightning documentation (LightningModule — PyTorch-Lightning 0.10.0 documentation) says not to call .cuda() or .to(device) manually, since Lightning handles device placement for me.
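For reference, Lightning relies on `nn.Module.to(device)`, which recursively moves every *registered* parameter, buffer, and submodule. A minimal sketch of that mechanism in plain PyTorch (class and attribute names made up for illustration):

```python
import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        # assigned as an attribute of an nn.Module, so it is registered
        # as a submodule and .to() will move its weights along
        self.conv1 = nn.Conv3d(1, 8, kernel_size=3)

model = Tiny()
model.to("cpu")  # Lightning issues the equivalent .to(device) call for you
print(all(p.device.type == "cpu" for p in model.parameters()))  # True
```

Only parameters that show up in `model.parameters()` are moved by this call; anything held outside the module tree stays where it was created.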

If I explicitly move the model to the GPU in the validation step (which I override in the pl.LightningModule) before calling .forward(), I still get input type torch.cuda.FloatTensor and weight type torch.FloatTensor. When I instead (just for testing purposes) moved the input tensor to the CPU, I got exactly the opposite error: input type torch.FloatTensor and weight type torch.cuda.FloatTensor.

Printing self.device before the self.forward(x) call also shows cuda:0.

I have checked my data preprocessing and augmentation steps: I never call .to(device) there, and when I change the dtype I use .type(torch.float32) rather than .to(torch.float32).
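One way to narrow this kind of mismatch down is to print the device of every registered parameter next to the device of the batch; a quick sketch with a stand-in model (the real model and batch would be substituted in). Notably, a layer that does not appear in this listing at all is itself a clue, because unregistered parameters never show up in `named_parameters()`:

```python
import torch
import torch.nn as nn

def report_devices(module: nn.Module, batch: torch.Tensor) -> None:
    """Print where the batch and every registered parameter actually live."""
    print("batch:", batch.device)
    for name, param in module.named_parameters():
        print(name, param.device)

# stand-in for the real model and batch
model = nn.Conv3d(1, 8, kernel_size=3)
x = torch.randn(1, 1, 8, 8, 8)
report_devices(model, x)
```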

Are you seeing the same error without using Lightning?


Before reimplementing everything in plain PyTorch without Lightning, I tried to reproduce the error with minimal code, which helped me find my mistake, so thank you!

The problem was that I defined some layers of my network outside the pl.LightningModule, in a class that did not inherit from nn.Module. Because of that, the parameters of these layers were never registered, so Lightning did not move them to the GPU.
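For anyone hitting the same issue, a minimal sketch of the failure mode (names invented): a layer held by a plain Python class is invisible to `parameters()`, so `.to(device)` silently moves nothing, whereas the same layer inside an `nn.Module` subclass is registered and follows the module:

```python
import torch.nn as nn

class PlainBlock:                      # plain class: NOT an nn.Module
    def __init__(self):
        self.conv = nn.Conv3d(1, 8, kernel_size=3)
    def __call__(self, x):
        return self.conv(x)

class ModuleBlock(nn.Module):          # fixed: inherits from nn.Module
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(1, 8, kernel_size=3)
    def forward(self, x):
        return self.conv(x)

class Net(nn.Module):
    def __init__(self, block_cls):
        super().__init__()
        self.block = block_cls()       # only registered if it is a Module

print(len(list(Net(PlainBlock).parameters())))   # 0 -> .to(device) moves nothing
print(len(list(Net(ModuleBlock).parameters())))  # 2 -> weight and bias follow .to()
```

The same registration rule is why Lightning (which just calls `.to(device)` on the LightningModule) never saw those weights.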


Thanks for sharing the update!