Cuda out of memory & increasing memory usage


When I ran my model on multiple GPUs, I ran into an out-of-memory error. So I halved the batch_size, and ran into the same error again (after a longer while).

So I checked the GPU memory usage with nvidia-smi, and I have two questions.
Here is the output of nvidia-smi:

| 0 33446 C python 9446MiB |
| 1 33446 C python 5973MiB |
| 2 33446 C python 5973MiB |
| 3 33446 C python 5945MiB |

  1. The memory usage on GPU 0 is roughly twice that of the other three. Is it normal for the first GPU to be used more heavily?

  2. The memory usage increases slowly as the program runs, which eventually causes the out-of-memory error. How can I solve this problem?
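In case it helps anyone hitting the same issue: a common cause of slowly growing GPU memory is accumulating the loss tensor itself across iterations, which keeps every step's computation graph alive. Calling `.item()` (or `.detach()`) when logging avoids this. A minimal sketch of the pattern, assuming a standard training loop (the model and data here are placeholders, not from the original post):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

total_loss = 0.0
for step in range(100):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

    # Leaky pattern: `total_loss += loss` keeps the autograd graph
    # of every iteration alive, so memory grows each step.
    # `.item()` converts to a plain Python float and frees the graph:
    total_loss += loss.item()
```

The same applies to anything stored for logging (accuracies, intermediate outputs): detach it from the graph before keeping it around.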

BTW, I installed PyTorch using the latest Dockerfile.


Hi RangoHU,

I'm facing the same problem. Do you have any idea why the memory increases during training?


My out of memory problem has been solved. Please check

Hi, thanks. I checked this thread, but it didn't help in my case.