Is it recommended to call torch.cuda.empty_cache() before every batch during training? Is it going to affect performance?
My model runs into OOM errors on some batches because of its dynamic nature, so I am trying to
call torch.cuda.empty_cache() and run garbage collection before every batch of data.
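You won’t be able to use more memory, as you are just deleting the cache. PyTorch can already reuse its cached memory for new tensors, so emptying the cache only returns the unused blocks to the GPU for other applications.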
I haven’t profiled it, but I would assume the performance might be affected in a negative way, since the memory has to be reallocated in order to be used again.
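To see what the cache is actually doing, here is a minimal sketch that compares allocated and reserved memory around an empty_cache() call. The helper name cache_report and the tensor size are just assumptions for illustration, and the CUDA part only runs if a GPU is available.

```python
import torch


def cache_report():
    """Return (allocated_bytes, reserved_bytes) for the current CUDA device,
    or None when no CUDA device is available. Hypothetical helper."""
    if not torch.cuda.is_available():
        return None
    return torch.cuda.memory_allocated(), torch.cuda.memory_reserved()


if torch.cuda.is_available():
    # 1024 * 1024 * 64 float32 values = 256 MiB
    x = torch.empty(1024, 1024, 64, device="cuda")
    print("live tensor:      ", cache_report())

    del x
    # The tensor is gone, but the caching allocator keeps the block reserved
    # so that the next allocation of a similar size can reuse it.
    print("after del:        ", cache_report())

    torch.cuda.empty_cache()
    # Unused cached blocks are handed back to the driver; a later allocation
    # has to request them again.
    print("after empty_cache:", cache_report())
else:
    print("No CUDA device found, nothing to report.")
```

If the reserved number only drops after empty_cache() while the allocated number already dropped after del, the memory was available to PyTorch the whole time, which is why emptying the cache should not help against an OOM inside the same process.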
If you are seeing an OOM error, you could try to lower e.g. the batch size.
If that’s not an option, have a look at checkpointing to trade compute for memory.
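As a rough sketch of what checkpointing could look like with torch.utils.checkpoint: the model below (CheckpointedMLP, its sizes and depth) is made up for illustration and is not your model. It assumes a PyTorch version that supports the use_reentrant argument.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedMLP(nn.Module):
    """Toy model (hypothetical) whose blocks are recomputed during backward."""

    def __init__(self, dim=64, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)]
        )

    def forward(self, x):
        for block in self.blocks:
            # Intermediate activations inside the block are not stored;
            # the block is run again in the backward pass to rebuild them.
            x = checkpoint(block, x, use_reentrant=False)
        return x


model = CheckpointedMLP()
x = torch.randn(8, 64, requires_grad=True)

loss = model(x).sum()
loss.backward()  # triggers the recomputation of each checkpointed block

print(x.grad.shape)
```

The saving comes from storing only the inputs of each checkpointed block instead of all intermediate activations, at the cost of one extra forward pass per block during backward. How much it helps depends on where the activation memory in your model actually sits.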