Memory allocation

Hello, I am totally new to PyTorch, so forgive my French … I have a stack trace showing this:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 446.00 MiB (GPU 0; 14.54 GiB total capacity; 753.56 MiB already allocated; 246.56 MiB free; 930.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My GPU has 14.54 GiB, so where does this go wrong? I should mention I am running the app in a Docker container; are there any known limitations there, maybe?

Any help/info highly appreciated! Thanks!

Ronald

Check if any other application is using GPU memory and could be causing the OOM issue.
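If it helps, here is a quick sketch of how you could check this from inside the container (assuming a reasonably recent PyTorch; device index 0 is just an example):

    import torch

    # Driver-level view of GPU 0 vs. what this process itself is using.
    free, total = torch.cuda.mem_get_info(0)
    print(f"driver view     : {free / 1024**2:.0f} MiB free of {total / 1024**2:.0f} MiB")
    print(f"allocated by us : {torch.cuda.memory_allocated(0) / 1024**2:.0f} MiB")
    print(f"reserved by us  : {torch.cuda.memory_reserved(0) / 1024**2:.0f} MiB")
    # If 'free' is far below 'total' while this process has allocated almost
    # nothing, another process is holding the memory; nvidia-smi on the host
    # will list the PIDs.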

Hi Ptrblck,
Yes there was; killing it solved the issue. But suppose you have two containers trying to share the GPU, how do you manage the congestion? Thanks!
Ronald

You could limit the available device memory for each process via torch.cuda.set_per_process_memory_fraction(fraction, device=None).
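A minimal sketch of how that could look inside each container (the 0.5 fraction and device index are only example values, not a recommendation):

    import torch

    # Cap this process at roughly half of GPU 0's memory so a second
    # process/container can use the other half. Tune the fraction to
    # your workloads.
    torch.cuda.set_per_process_memory_fraction(0.5, device=0)

    x = torch.randn(1024, 1024, device="cuda:0")  # allocations work as usual
    # Allocations that would push this process past the cap raise
    # torch.cuda.OutOfMemoryError instead of consuming memory the other
    # container needs.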

Hmm, thanks, but would that mean you need to statically divide the memory over the processes and do the assignment per container/process? That looks like the only alternative, which renders the containerization less flexible…

Yes, you would define the limit per process. How else would you handle it? You could just use the default and allow all processes to allocate the entire memory but would then run into the original issue.

Got it, they are unaware of each other. Tx man!

One last question: when the Python process is finished and waiting for new work, can I 'release' the GPU in the meantime? I am now in a loop waiting for user input, so during this period I'd rather free up the GPU. Is there some statement for that? Thanks!

Yes, you could del all objects which are not needed anymore and clear the cache via torch.cuda.empty_cache(). This will return the cached memory to the OS and will allow other processes to allocate it. Only the CUDA Context and referenced tensors/objects will still allocate and use memory.
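A rough sketch of that idle-time cleanup (the model and tensor names are just placeholders for your own objects):

    import torch

    model = torch.nn.Linear(4096, 4096).cuda()
    data = torch.randn(64, 4096, device="cuda")

    with torch.no_grad():
        out = model(data)      # ... your actual work ...

    # Idle period: drop the references, then release the cached blocks so
    # other processes can allocate them.
    del model, data, out
    torch.cuda.empty_cache()
    # The CUDA context created by this process still occupies some memory.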

Hi, and thanks again for your help, highly appreciated; I'm really new to this. Suppose I put empty_cache() in the finally clause of my Python code, should that be sufficient? All objects are del/disposed/destroyed by design at the end of the finally block, right? Or do I need to be really explicit about which objects to del? Doesn't Python have garbage collection or something, like in .NET?
Tx!

Python uses function scoping and will delete objects once you return from the function. In case you’ve created tensors in such a function they will be deleted and the CUDA memory will be returned to the cache. You can then free this cache via torch.cuda.empty_cache().
In case you are creating objects in the global scope you would need to delete them explicitly before calling empty_cache(). The same applies to returned objects from any function which are still referenced.
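A small illustration of the difference (the tensors here are hypothetical, just to show the pattern):

    import torch

    def run_step():
        # 'tmp' only lives inside this function; when it returns, the tensor
        # is freed and its memory goes back to PyTorch's caching allocator.
        tmp = torch.randn(2048, 2048, device="cuda")
        return tmp.sum().item()   # return a plain Python number, not a tensor

    result = run_step()
    torch.cuda.empty_cache()      # cache from run_step() can now be released

    # Objects created in the global scope (or returned tensors you keep
    # referencing) have to be deleted explicitly first:
    big = torch.randn(2048, 2048, device="cuda")
    del big
    torch.cuda.empty_cache()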

Yes, Python uses a garbage collector.