You could try the method FairSeq uses.
Depending on the OOM error, you could empty the cache or, if the batch is simply too large, skip it.
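Here is a minimal sketch of that idea, not FairSeq's actual code. The helper name `run_step_skipping_oom` and its `step_fn`/`optimizer` arguments are hypothetical, and it assumes the OOM surfaces as a `RuntimeError` whose message contains "out of memory". The recovery runs outside the `except` block so the exception no longer holds references to the failed step's tensors when the cache is emptied.

```python
import torch


def run_step_skipping_oom(step_fn, batch, optimizer=None):
    """Run step_fn(batch); on an OOM error, free cached memory and skip the batch.

    Returns (result, skipped). Hypothetical helper for illustration only.
    """
    oom = False
    try:
        result = step_fn(batch)
    except RuntimeError as exc:
        # Assumption: OOM errors are identified by their message text.
        if "out of memory" not in str(exc):
            raise  # unrelated error, do not swallow it
        oom = True

    if oom:
        # Drop gradients from the partially completed step, if any.
        if optimizer is not None:
            optimizer.zero_grad()
        # Return cached, unused blocks to the GPU allocator.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return None, True

    return result, False
```

In a training loop you would call it once per batch and `continue` whenever `skipped` is true. Skipping changes which samples the model sees, so it is worth logging how often it happens.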
Would that work for you?