I am trying to do a bunch of things that tend to take up memory. Basically I am passing various images to a feature extractor and then passing those encodings to 3 LSTMs for inference. I move the feature extractor and the 3 LSTMs to CUDA before calling the function that takes the images and the models as input.
I call gc.collect() and torch.cuda.empty_cache() at the end of every forward pass, but the GPU memory occupied still seems to be increasing. For the first few images it sits at 3.5 GB and stays there, then after some more iterations it jumps to 6.7 GB and keeps increasing, even though the number of images is small. I even wrapped the forward pass in a try/except to catch the out-of-memory error and carry on, but it still doesn't clear the GPU memory…
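For context, a minimal sketch of the loop described above (all names here are hypothetical stand-ins, not my actual code):

```python
import gc
import torch

def run_inference(images, extractor, lstms, device="cpu"):
    """Pass each image through the extractor, then through the 3 LSTMs."""
    results = []
    for img in images:
        feats = extractor(img.to(device))           # image -> encoding
        outs = [lstm(feats)[0] for lstm in lstms]   # encoding -> 3 LSTM outputs
        # Move results off the GPU; note these tensors still carry
        # autograd history, since nothing here disables grad tracking.
        results.append([o.cpu() for o in outs])
        gc.collect()
        torch.cuda.empty_cache()  # releases cached blocks, not referenced tensors
    return results
```

In my real code `device` is `"cuda"`; the behavior is the same either way.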
`del gpu_tensor` also does not help, nor does `gpu_tensor.cpu()`.
I tried running the LSTMs on a different GPU by moving them and the tensor to another GPU index, but I get a `cublas runtime error: library not initialized`. Deleting the `~/.nv` folder before and during runtime did not help.
The code is difficult to follow with the abbreviated variable names and commented-out code, but in general you should wrap inputs with `Variable(tensor, volatile=True)` when testing. With volatile inputs, the backward graph is never constructed, which saves a lot of memory. Also worth noting: `torch.cuda.empty_cache()` is never a solution to OOM. It only releases cached, unoccupied blocks back to the driver; it doesn't increase the amount of free memory you can actually use.
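To illustrate the point: `volatile` was the pre-0.4 API; in current PyTorch the equivalent is wrapping inference in `torch.no_grad()`. A minimal sketch (the LSTM here is just a stand-in for one of your three):

```python
import torch

model = torch.nn.LSTM(input_size=8, hidden_size=16)  # stand-in model
x = torch.randn(5, 1, 8)  # (seq_len, batch, features)

# Default mode: the output is attached to the autograd graph, so every
# intermediate activation is kept alive for a potential backward pass.
y_train, _ = model(x)
print(y_train.requires_grad)  # True

# Inference mode: no graph is built, so activations can be freed
# as soon as the forward pass no longer needs them.
with torch.no_grad():
    y_eval, _ = model(x)
print(y_eval.requires_grad)  # False
```

If your loop keeps references to graph-attached outputs (e.g. appending them to a list), the graphs accumulate across iterations, which matches the steadily growing memory you describe.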