RuntimeError: "Input tensor is too large." for validation steps of a 3D conv backbone

```
RuntimeError: Input tensor is too large.
```

I get this error during training runs of my model with a 3D conv backbone (X3D with modified final layers), but only in the validation steps.
With a non-3D backbone, both training and validation steps work fine.

The complete traceback is:

```
Traceback (most recent call last):
  File "train.py", line 166, in <module>
    main(opt)
  File "train.py", line 142, in main
    log_dict_val, results, *cmat = trainer.val(epoch, val_loader)
  File "/home/nimnuske/fairmot-exploration-room/src/lib/trains/base_trainer.py", line 144, in val
    ret, results, *cmat = self.run_epoch('val', epoch, data_loader)
  File "/home/nimnuske/fairmot-exploration-room/src/lib/trains/base_trainer.py", line 79, in run_epoch
    output, loss, loss_stats = model_with_loss(batch)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/fairmot-exploration-room/src/lib/trains/base_trainer.py", line 19, in forward
    outputs = self.model(batch['input'])
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/fairmot-exploration-room/src/lib/models/networks/x3d_backbone.py", line 1072, in forward
    x = self.model(x.permute(0,2,1,3,4).float())
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/pytorchvideo/models/net.py", line 43, in forward
    x = self.blocks[idx](x)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/pytorchvideo/models/resnet.py", line 1393, in forward
    x = res_block(x)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/pytorchvideo/models/resnet.py", line 1180, in forward
    x = self.branch_fusion(shortcut, self.branch2(x))
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/pytorchvideo/models/resnet.py", line 1349, in forward
    x = self.conv_b(x)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 613, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/nimnuske/.conda/envs/mcmot/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 608, in _conv_forward
    return F.conv3d(
RuntimeError: Input tensor is too large.
```

The train and val input tensors have the same shape and are on the same device:

```
train torch.Size([60, 8, 3, 608, 1088]) cuda:0
val   torch.Size([60, 8, 3, 608, 1088]) cuda:0
```
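What makes this confusing is that the input itself is fine: 60 · 8 · 3 · 608 · 1088 = 952,565,760 elements, which is below the 32-bit indexing limit of 2**31 − 1 = 2,147,483,647. So whatever is "too large" must be an intermediate activation inside the backbone. A minimal hook-based probe (a debugging sketch, not part of the training code) can point at the module that produces such a tensor:

```python
import torch
import torch.nn as nn

INT32_MAX = 2**31 - 1  # 2,147,483,647

# The raw input still fits 32-bit indexing:
assert 60 * 8 * 3 * 608 * 1088 == 952_565_760 < INT32_MAX

def report_oversized(model: nn.Module):
    """Attach forward hooks that print every module whose output has
    more elements than 32-bit index math can address."""
    def hook(module, inputs, output):
        if torch.is_tensor(output) and output.numel() > INT32_MAX:
            print(f"{type(module).__name__}: shape={tuple(output.shape)}, "
                  f"numel={output.numel():,}")
    return [m.register_forward_hook(hook) for m in model.modules()]
```

Registering these hooks before the failing forward pass makes the producer of the oversized activation show up right before the crash (remember to `.remove()` the returned handles afterwards).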

I found the source of my problem, but the error still seems odd to me.
The cause was a branch in my code that skipped DataParallel for validation and instead put the whole batch on a single device, where it did not fit.
So the error I would have expected is a CUDA out-of-memory error.
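To make the failure mode concrete, here is a minimal sketch of the two code paths (the Conv3d stand-in is a placeholder for the real backbone, not my actual model):

```python
import torch
import torch.nn as nn

# Placeholder stand-in for the X3D backbone, which works on
# (N, C, T, H, W) tensors after the permute in its forward().
model = nn.Conv3d(3, 24, kernel_size=3, padding=1).cuda()
parallel_model = nn.DataParallel(model)  # replicates across all visible GPUs

batch_input = torch.randn(60, 3, 8, 608, 1088)

# Training path: DataParallel scatters dim 0, so each replica sees only
# a slice of the 60 clips and every activation stays below the 32-bit
# indexing limit.
out_train = parallel_model(batch_input)

# Buggy validation path: the wrapper is bypassed and all 60 clips land on
# cuda:0 at once. Inside the real backbone a channel-expanding conv then
# produces an activation with more than 2**31 - 1 elements, and the next
# conv3d rejects its input with "Input tensor is too large." (this tiny
# stand-in would simply run out of memory instead).
out_val = model(batch_input.cuda(0))
```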

Does anyone know why I got 'Input tensor is too large' instead?

You are not necessarily running out of memory, but are hitting a 32-bit indexing limitation, which is raised here.
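For reference, that check guards CUDA kernels which do their index arithmetic with 32-bit integers; in Python terms the condition is roughly the following (a simplified sketch of ATen's canUse32BitIndexMath, which in reality also inspects strides):

```python
import torch

INT32_MAX = torch.iinfo(torch.int32).max  # 2_147_483_647

def fits_32bit_indexing(t: torch.Tensor) -> bool:
    # Simplified: every element offset must be representable as an int32;
    # the conv3d path rejects inputs for which this does not hold.
    return t.numel() <= INT32_MAX
```

That is why you hit this guard before an out-of-memory error. If the validation batch has to stay on a single device, splitting it along dim 0 (e.g. with torch.split) keeps the intermediate activations below the limit.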