Why does torch call empty_cache()?

Hi. I’m studying PyTorch’s autograd engine.

I found that empty_cache() is called while running loss.backward().

This behavior only occurs on the first iteration.

I know that PyTorch’s memory allocator uses a caching pool, so I think there should be no reason to call empty_cache().

But PyTorch calls it anyway.

What’s the reason?

Internally, empty_cache is called if an out-of-memory error is detected, in an attempt to recover from it. Also, if you are using cudnn with benchmark=True, different algorithms will be profiled in the first iteration, which can use different amounts of memory for their workspaces. To recover this memory and allow other processes to use it, empty_cache will also be used.
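If you want to see the effect yourself, here is a minimal sketch (the conv layer and input shapes are just placeholders I picked, and the empty_cache() call is made explicitly here rather than by PyTorch internals) that enables benchmark=True, runs one forward/backward pass, and prints how much memory the caching allocator keeps reserved before and after emptying the cache:

```python
import torch
import torch.nn as nn

# Minimal sketch (placeholder layer and shapes) to observe the caching
# allocator around the first iteration with cudnn benchmarking enabled.
torch.backends.cudnn.benchmark = True  # profile conv algorithms on first use

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda", requires_grad=True)

out = conv(x)
out.sum().backward()  # first backward pass; cudnn workspaces were allocated

print("reserved before empty_cache:", torch.cuda.memory_reserved())
torch.cuda.empty_cache()  # return unused cached blocks to the driver
print("reserved after empty_cache: ", torch.cuda.memory_reserved())
```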

Thank you for the reply.

I have more questions.

As you said, I’m using cudnn with benchmark=True.

I know that if I set benchmark=True, it will automatically find the right algorithm for my hardware.

But I’m not sure what these algorithms do…

Could you please explain, if you are not too busy?

cudnnFind will use different kernel implementations for the current workload, e.g. the current conv layer. You can imagine it as iterating over a conv implementation using matrix multiplications, an FFT, Winograd, etc. Each of these different algorithms (for the same convolution with the given input shapes, padding, dilation, etc.) will be executed and the time will be captured. Once this is done, the fastest would be selected in the best case.
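As a rough way to see this profiling step (not cudnn’s internals, just a toy timing loop with a placeholder conv layer and shapes), you can compare the first call against the following ones; with benchmark=True the first call for a given input configuration includes the algorithm search and is noticeably slower:

```python
import time
import torch
import torch.nn as nn

# Toy timing loop (placeholder layer/shapes): the first call with a new
# input configuration triggers the cudnn algorithm search; later calls
# reuse the algorithm that was selected as the fastest.
torch.backends.cudnn.benchmark = True

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")

for i in range(3):
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    y = conv(x)
    torch.cuda.synchronize()
    print(f"iteration {i}: {(time.perf_counter() - t0) * 1e3:.2f} ms")
```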

Thank you so much!

It was very helpful! Have a nice day!