I am training an LSTM model and am currently running a random search.
Initially, I emptied the cache with torch.cuda.empty_cache() after each random search, but I still got an OOM error after a few searches (usually around 3).
Then I read that, for the memory to actually be freed, I first need to del the variable. However, even after doing that, I still have the same issue. I am tracking the allocated GPU memory with torch.cuda.memory_allocated(), and I can see that the memory is freed after each random search. However, when a new random search starts, the allocated memory is a bit higher than it was for the previous one. I don't think this is caused by a variable that is not being deleted. Is there something I am missing?
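
For reference, the per-search cleanup and measurement described above can be sketched roughly as follows. This is only an illustration under assumptions, not my actual code: run_search, allocated_bytes, and train_one_trial are hypothetical names, the result dictionary with a "score" key is made up, and the gc.collect() call is an extra step that was not part of what I described.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def cuda_ready():
    """True only if PyTorch is installed and a CUDA device is usable."""
    return torch is not None and torch.cuda.is_available()


def allocated_bytes():
    """Bytes currently occupied by live tensors on the GPU (0 without CUDA)."""
    return torch.cuda.memory_allocated() if cuda_ready() else 0


def run_search(train_one_trial, configs):
    """Run each trial, release its objects, then record allocated memory."""
    history = []
    for config in configs:
        # Hypothetical: returns a dict holding the model, tensors, and a score.
        result = train_one_trial(config)
        # Keep only a plain Python number so no tensor (or graph) stays alive.
        score = float(result["score"])
        del result        # drop the reference to the model and its tensors
        gc.collect()      # assumption: also collect any reference cycles
        if cuda_ready():
            torch.cuda.empty_cache()  # release cached, unused blocks
        history.append((score, allocated_bytes()))
    return history
```

If the second value in history keeps growing from one search to the next, something created during a search is still referenced after the cleanup step.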