RuntimeError: cuda runtime error (2) : out of memory

Does the error mean the loss is too big? I don't understand why it runs out of memory.

I have changed the code to put the generator inside the model and to call backward() only once. Could this combination cause the problem?

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
  File "main.py", line 241, in
    main()
  File "main.py", line 238, in main
    train.train(model, optim, criterion, trainData, validData, testData, opt)
  File "/home/zeng/code/tb-seq2seq/train.py", line 263, in train
    loss.backward()
  File "/home/zeng/envs/pytorch_0.1.12_py27/local/lib/python2.7/site-packages/torch/autograd/variable.py", line 146, in backward
    self.execution_engine.run_backward((self,), (gradient,), retain_variables)
  File "/home/zeng/envs/pytorch_0.1.12_py27/local/lib/python2.7/site-packages/torch/nn/functions/thnn/auto.py", line 46, in backward
    grad_input = grad_output.new().resize_as_(input).zero_()
RuntimeError: cuda runtime error (2) : out of memory at /b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.cu:66

https://github.com/pytorch/pytorch/issues/958 seems to be the same error, isn't it? Have you solved it by now, and how?

Hi,
The error means that during the backward pass, when PyTorch tries to allocate memory to store the gradients and perform the computations, there is not enough GPU memory left. You should try reducing the batch size.
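If your training loop builds batches with a DataLoader, lowering the batch size is usually a one-line change. Here is a minimal sketch in current PyTorch (the toy model, dataset, and sizes are stand-ins, not your code; the 0.1.12 release used in this thread would also need Variable wrappers):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for the real model and data; the point is only the batch_size knob.
model = nn.Linear(100, 10).cuda()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

data = TensorDataset(torch.randn(1024, 100), torch.randint(0, 10, (1024,)))
# Halving batch_size roughly halves the memory used by the activations
# that autograd keeps around for the backward pass.
loader = DataLoader(data, batch_size=16, shuffle=True)  # e.g. was 32

for inputs, targets in loader:
    inputs, targets = inputs.cuda(), targets.cuda()
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()   # smaller batches -> smaller buffers allocated here
    optimizer.step()
```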


Hi, isn't the reason a large number of parameters? I have run into the same issue.

Hi,

Yes, having more parameters makes your model more memory-hungry.
If you still want to use the same network, one solution is to reduce the batch size; that way the intermediate computations will be smaller and will use less memory.
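As a rough illustration of that point (hypothetical layer sizes, and it uses the torch.cuda memory-stat helpers from current PyTorch, which did not exist in 0.1.12): the parameter memory stays fixed, while the activations saved for backward grow with the batch size.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096)).cuda()

for batch_size in (8, 64, 512):
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(batch_size, 4096, device="cuda", requires_grad=True)
    model(x).sum().backward()          # backward needs the activations saved in the forward pass
    peak_mib = torch.cuda.max_memory_allocated() / 1024 ** 2
    print(f"batch_size={batch_size:4d}  peak allocated ~{peak_mib:.0f} MiB")
```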

Got it. Thank you very much!