How to reset the GPU state after a CUDA OOM error?

Say I’m testing some code and monitoring the allocated GPU memory with torch.cuda.max_memory_allocated().
Edit: I deleted the misleading toy example.

The situation is that I run a benchmark over several different configurations and log their execution time, GPU memory footprint, etc.
The issue is that when one of the configurations leads to a CUDA OOM error, the benchmark of the next configuration always reports incorrect memory stats (I know they are wrong because the numbers change when I swap the execution order). How can this be avoided?
A typical output looks like this:


T=150   U=20    V=5000  N=1     time=2.70       memory=243
T=150   U=20    V=5000  N=16    time=50.05      memory=3842
T=150   U=20    V=5000  N=32    time=128.20     memory=7684
T=150   U=20    V=5000  N=64    time=276.37     memory=15360
T=150   U=20    V=5000  N=128   error=CUDA out of memory. ...

T=1500  U=300   V=50    N=1     time=16.21      memory=23044   <- Wrong value, this should be much smaller
T=1500  U=300   V=50    N=16    time=78.82      memory=5763
T=1500  U=300   V=50    N=32    time=209.08     memory=11520
T=1500  U=300   V=50    N=64    time=398.86     memory=23044

The repo can be found here: maxwellzh/warp-rnnt: CUDA-Warp RNN-Transducer, in pytorch_binding/

It is not clear to me what you want to do.

What do you mean by “resetting error states”? By “error memory stats” do you mean “incorrect memory stats”?

Are you saying that after an OOM error, torch.cuda.max_memory_allocated() returns a wrong value? If so, how do you know that this is a wrong value?

Hi @gphilip, I edited the topic; it should be clearer now.
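
For reference, here is a minimal sketch of one way to keep the memory stats of each configuration independent of the previous one. It is not the benchmark code from the repo: `reset_gpu_stats`, `run_one` and the `fn(**cfg)` calling convention are hypothetical names made up for illustration, and reporting memory in MiB is an assumption. The idea is to call torch.cuda.reset_peak_memory_stats() before every run, so that torch.cuda.max_memory_allocated() only covers that run, and to do the cleanup outside the `except` block, since the exception object keeps the failed frames (and their tensors) alive while it is being handled. Whether this is enough for the case above is untested.

```python
import gc
import time

try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None
    HAS_CUDA = False


def reset_gpu_stats():
    """Free unreferenced tensors, drop cached blocks, restart peak tracking."""
    gc.collect()
    if HAS_CUDA:
        torch.cuda.empty_cache()
        torch.cuda.reset_peak_memory_stats()


def run_one(fn, **cfg):
    """Run fn(**cfg) once and return its config plus time/memory or the error."""
    reset_gpu_stats()  # start every configuration from a clean peak counter
    result = dict(cfg)
    start = time.perf_counter()
    try:
        fn(**cfg)
        if HAS_CUDA:
            torch.cuda.synchronize()  # wait for queued kernels before timing
            # Peak allocation since the last reset, in MiB (assumed unit).
            result["memory"] = torch.cuda.max_memory_allocated() // 2**20
        result["time"] = time.perf_counter() - start
    except RuntimeError as exc:
        # CUDA OOM is raised as a RuntimeError (or a subclass of it).
        # Only record the message here; do not try to free memory yet.
        result["error"] = str(exc).splitlines()[0]
    # The except block has ended, so the exception and the frames it
    # referenced can be collected before the next configuration runs.
    reset_gpu_stats()
    return result
```

Usage would be something like `print(run_one(benchmark_fn, T=150, U=20, V=5000, N=128))` for each configuration, where `benchmark_fn` stands for whatever function runs a single configuration.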