How to free GPU memory when OOM error occurs?

I’m quite concerned about how to free GPU memory when an OOM error occurs. It’s quite easy in Theano, but I don’t know how to do it in PyTorch.

This is quite a serious problem, because when an OOM occurs in a deployment environment, we can’t just kill the process and restart it.

You could try the approach FairSeq uses: depending on the OOM error, you could empty the cache or just skip the batch if it’s too large.
Would that work for you?
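A minimal sketch of that recovery pattern (the helper name `run_step_with_oom_recovery` and its arguments are my own, not FairSeq's API): wrap the training step, and if a `RuntimeError` whose message contains "out of memory" is raised, free cached memory and skip the batch instead of crashing.

```python
def run_step_with_oom_recovery(step_fn, free_memory_fn):
    """Run one training step; on a CUDA OOM error, free memory and skip the batch.

    step_fn:        zero-argument callable performing forward/backward/step.
    free_memory_fn: zero-argument callable that releases cached GPU memory,
                    e.g. torch.cuda.empty_cache in PyTorch.
    Returns the step's result, or None if the batch was skipped due to OOM.
    """
    try:
        return step_fn()
    except RuntimeError as e:
        # CUDA OOM surfaces as a RuntimeError with "out of memory" in the message.
        if "out of memory" in str(e):
            free_memory_fn()  # release cached blocks so the next batch can allocate
            return None       # signal: this batch was skipped
        raise  # any other RuntimeError is a real bug; re-raise it
```

In actual PyTorch code you would pass `torch.cuda.empty_cache` as `free_memory_fn`, and it usually also helps to `del` the local tensors from the failed step (loss, activations) and call `optimizer.zero_grad()` before emptying the cache, so the references holding the memory are actually dropped first.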

Thank you @ptrblck, I’ll try that and report back