How to free GPU memory when an OOM error occurs?

You could try the approach FairSeq uses: catch the OOM error inside the training step and recover from it there.
Depending on the error, you could either empty the CUDA cache (`torch.cuda.empty_cache()`) or just skip the batch if it's too large.
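Here is a minimal sketch of that idea, assuming a standard PyTorch training loop. It is loosely modeled on the pattern described above and is not copied from FairSeq; the helper names `is_oom_error` and `train_step` are hypothetical. The recovery code runs after the `except` block on purpose, so that the exception object (and any tensors its traceback still references) has been released before the cache is emptied.

```python
import torch


def is_oom_error(exc: BaseException) -> bool:
    # Assumption: a CUDA OOM surfaces as a RuntimeError (or a subclass of it)
    # whose message contains "out of memory".
    return isinstance(exc, RuntimeError) and "out of memory" in str(exc)


def train_step(model, optimizer, loss_fn, inputs, targets):
    """Run one training step.

    Returns the loss as a float, or None if the batch was skipped due to OOM.
    """
    try:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        return loss.item()
    except RuntimeError as exc:
        if not is_oom_error(exc):
            raise  # unrelated error: do not swallow it

    # Reached only after an OOM. We are outside the except block here, so the
    # exception and the stack frames it referenced are no longer held.
    optimizer.zero_grad()  # drop any partially accumulated gradients
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached, unused blocks to the driver
    return None
```

In the training loop you would call `train_step(...)` for every batch and simply `continue` when it returns `None`. If the same batch keeps failing, emptying the cache will not help and the batch size has to be reduced.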
Would that work for you?