CUDA out of memory & increasing memory usage

(Rango Hu) #1


When I ran my model on multiple GPUs, I ran into an out-of-memory error. So I halved the batch_size, and ran into the same error again (after a longer while).

So I checked the GPU memory usage with nvidia-smi, and I have two questions.
Here is the output of nvidia-smi:

| 0 33446 C python 9446MiB |
| 1 33446 C python 5973MiB |
| 2 33446 C python 5973MiB |
| 3 33446 C python 5945MiB |

  1. The memory used on GPU_0 is roughly twice that of the other three. Is it normal for the first GPU to use more memory?

  2. The memory usage increases slowly as the program runs, which eventually causes the out-of-memory error. How can I solve this problem?

BTW, I installed PyTorch using the latest Dockerfile.
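
To put a number on the imbalance in question 1, the nvidia-smi rows quoted above can be parsed and compared. This is only a rough sketch: the names `parse_rows` and `first_gpu_ratio` are made up for illustration, and the regular expression assumes the rows look exactly like the ones pasted in this post (GPU index, PID, type, process name, memory in MiB).

```python
import re

# Process rows copied from the nvidia-smi output in the post above.
NVIDIA_SMI_ROWS = """\
| 0 33446 C python 9446MiB |
| 1 33446 C python 5973MiB |
| 2 33446 C python 5973MiB |
| 3 33446 C python 5945MiB |
"""

# Assumed row layout: | <gpu> <pid> <type> <process name> <N>MiB |
ROW = re.compile(r"\|\s*(\d+)\s+(\d+)\s+\S+\s+\S+\s+(\d+)MiB\s*\|")


def parse_rows(text):
    """Return {gpu_index: used_MiB}, summing rows that share a GPU index."""
    usage = {}
    for match in ROW.finditer(text):
        gpu, _pid, mib = match.groups()
        usage[int(gpu)] = usage.get(int(gpu), 0) + int(mib)
    return usage


def first_gpu_ratio(usage):
    """Ratio of GPU 0's memory to the mean memory of the other GPUs."""
    others = [mib for gpu, mib in usage.items() if gpu != 0]
    return usage[0] / (sum(others) / len(others))


usage = parse_rows(NVIDIA_SMI_ROWS)
ratio = first_gpu_ratio(usage)
print(usage)
print(f"GPU 0 uses {ratio:.2f}x the mean of the other GPUs")
```

For the numbers in this post the ratio comes out at about 1.58, so GPU_0 holds a bit more than one and a half times the memory of each of the other three. Running the same comparison on snapshots taken a few minutes apart would also show how fast the usage in question 2 grows.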



Hi Dear RangoHU,

I am facing the same problem. Do you have any idea why the memory keeps increasing during training?



My out of memory problem has been solved. Please check

(Rango Hu) #4

Hi, thanks. I checked this thread, but it didn’t help in my case.