PyTorch Lightning RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same


I am trying to implement a model using PyTorch Lightning. During the sanity check validation loop (and also in the training loop if I disable the sanity check) I get the following error:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Full Traceback:

Traceback (most recent call last):
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 28, in <module>, datamodule=dm)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 532, in fit
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 980, in _run
    results = self._run_stage()
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 1021, in _run_stage
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 1050, in _run_sanity_check
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 181, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 115, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/loops/", line 376, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/trainer/", line 294, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/lightning/pytorch/strategies/", line 393, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 159, in validation_step
    predictions = self.forward(samples)  # [N, 2, D, H, W]
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 105, in forward
    x1 = self.down1(x)
  File "/home/sfeldmann/PycharmProjects/dins_lightning/", line 44, in __call__
    x = self.relu(self.norm1(self.conv1(x)))
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 613, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/sfeldmann/anaconda3/envs/dins_lightning/lib/python3.11/site-packages/torch/nn/modules/", line 608, in _conv_forward
    return F.conv3d(
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

This suggests that my input tensor was moved to the GPU while my model is still on the CPU. The PyTorch Lightning documentation (LightningModule — PyTorch-Lightning 0.10.0 documentation) says not to call .cuda() or .to(device) manually, since Lightning handles device placement for me.
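For reference, Lightning relies on `nn.Module.to(device)`, which recursively moves every *registered* parameter, buffer, and submodule. A minimal sketch of that mechanism in plain PyTorch (class and attribute names made up for illustration):

```python
import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        # assigned as an attribute of an nn.Module, so it is registered
        # as a submodule and .to() will move its weights along
        self.conv1 = nn.Conv3d(1, 8, kernel_size=3)

model = Tiny()
model.to("cpu")  # Lightning issues the equivalent .to(device) call for you
print(all(p.device.type == "cpu" for p in model.parameters()))  # True
```

Only parameters that show up in `model.parameters()` are moved by this call; anything held outside the module tree stays where it was created.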

If I explicitly move the model to the GPU in the validation step (which I override in the pl.LightningModule) before calling .forward(), I still get input type torch.cuda.FloatTensor and weight type torch.FloatTensor. When I instead (just for testing purposes) moved the input tensor to the CPU, I got exactly the opposite error: input type torch.FloatTensor and weight type torch.cuda.FloatTensor.

Printing self.device before the self.forward(x) call also shows cuda:0.

I have checked my data preprocessing and augmentation steps: I never call .to(device) there, and when I change the dtype I use .type(torch.float32) rather than .to(torch.float32).
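One way to narrow this kind of mismatch down is to print the device of every registered parameter next to the device of the batch; a quick sketch with a stand-in model (the real model and batch would be substituted in). Notably, a layer that does not appear in this listing at all is itself a clue, because unregistered parameters never show up in `named_parameters()`:

```python
import torch
import torch.nn as nn

def report_devices(module: nn.Module, batch: torch.Tensor) -> None:
    """Print where the batch and every registered parameter actually live."""
    print("batch:", batch.device)
    for name, param in module.named_parameters():
        print(name, param.device)

# stand-in for the real model and batch
model = nn.Conv3d(1, 8, kernel_size=3)
x = torch.randn(1, 1, 8, 8, 8)
report_devices(model, x)
```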

Are you seeing the same error without using Lightning?


Before reimplementing everything in plain PyTorch without Lightning, I tried to reproduce the error with minimal code, which helped me find my mistake, so thank you!

The problem was that I defined some layers of my network outside the pl.LightningModule, in a class that did not inherit from nn.Module. Because of that, the parameters of these layers were never registered, so Lightning did not move them to the GPU.
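For anyone hitting the same issue, a minimal sketch of the failure mode (names invented): a layer held by a plain Python class is invisible to `parameters()`, so `.to(device)` silently moves nothing, whereas the same layer inside an `nn.Module` subclass is registered and follows the module:

```python
import torch.nn as nn

class PlainBlock:                      # plain class: NOT an nn.Module
    def __init__(self):
        self.conv = nn.Conv3d(1, 8, kernel_size=3)
    def __call__(self, x):
        return self.conv(x)

class ModuleBlock(nn.Module):          # fixed: inherits from nn.Module
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(1, 8, kernel_size=3)
    def forward(self, x):
        return self.conv(x)

class Net(nn.Module):
    def __init__(self, block_cls):
        super().__init__()
        self.block = block_cls()       # only registered if it is a Module

print(len(list(Net(PlainBlock).parameters())))   # 0 -> .to(device) moves nothing
print(len(list(Net(ModuleBlock).parameters())))  # 2 -> weight and bias follow .to()
```

The same registration rule is why Lightning (which just calls `.to(device)` on the LightningModule) never saw those weights.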


Thanks for sharing the update!