"RuntimeError: Input tensor is too large." during validation steps of a 3D conv backbone

I found the source of my problem, but the error still seems weird to me.
The problem was a clause in my code that, for validation, bypassed DataParallel and tried to put the whole batch on a single device, where it did not fit.
So the error I would have expected is a CUDA out-of-memory error.
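
For anyone hitting the same thing, here is a minimal sketch of the fix: validation goes through the same DataParallel wrapper as training instead of calling the underlying module on one device. The names `backbone`, `parallel_model`, and `validation_step` are hypothetical, and the tiny `Conv3d` only stands in for my actual backbone, which is not shown here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the 3D conv backbone (the real model is not shown).
backbone = nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3, padding=1)

# DataParallel splits the batch along dim 0 across the visible GPUs.
# If no GPU is available, it simply calls the wrapped module.
parallel_model = nn.DataParallel(backbone)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
parallel_model.to(device)


def validation_step(batch):
    """Run validation through the DataParallel wrapper,
    not through parallel_model.module on a single device."""
    parallel_model.eval()
    with torch.no_grad():
        return parallel_model(batch.to(device))


# Toy batch with layout (N, C, D, H, W); real batches would be far larger.
batch = torch.randn(4, 1, 8, 8, 8)
out = validation_step(batch)
print(out.shape)
```

The point is only that the validation path uses the wrapper, so each device sees a slice of the batch rather than the whole thing.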

Does anyone know the reason why I got ‘Input tensor is too large’ instead?