CUDNN Error in backprop for big batch sizes (input is contiguous)

I implemented a model that combines an MLP, an RNN, and a CNN. With a batch size of 420, everything works (that is, I don't get any errors). However, as soon as I increase the batch size (to, say, 840), I get the following error:

Traceback (most recent call last):
  File "", line 152, in <module>
  File "/home/tbaumgae/.local/lib/python3.5/site-packages/torch/autograd/", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
RuntimeError: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

The forward pass works fine. I checked whether all the variables are contiguous, and they are. My prediction and target for the loss calculation are contiguous too, as is the returned loss. The error only occurs when calling backward(). Any ideas why this would happen?
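For reference, this is roughly what I understand "contiguous" to mean when I run the check. It is a minimal pure-Python sketch, not PyTorch code: `is_row_major_contiguous` is a hypothetical helper I made up for illustration, and it assumes a dense row-major layout with strides counted in elements. The shapes are made-up examples, not my real model's.

```python
def is_row_major_contiguous(shape, strides):
    """Return True if `strides` describe a dense row-major layout for `shape`.

    Hypothetical helper for illustration only; strides are in elements.
    """
    if 0 in shape:
        return True  # an empty buffer has no layout to violate
    expected = 1
    # Walk from the innermost (last) dimension outwards.
    for size, stride in zip(reversed(shape), reversed(strides)):
        if size == 1:
            continue  # the stride of a size-1 dimension never matters
        if stride != expected:
            return False
        expected *= size
    return True


# A (batch=840, features=16) buffer stored row by row.
print(is_row_major_contiguous((840, 16), (16, 1)))
# The same buffer viewed transposed: shape and strides are both swapped.
print(is_row_major_contiguous((16, 840), (1, 16)))
```

On a real tensor the two arguments would be `tuple(t.size())` and `t.stride()`. Note that this only describes the tensors I can inspect myself in the forward pass; it says nothing about the gradient buffers created during backward().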