Same batch size in evaluation triggers "RuntimeError: CUDA out of memory."

Hi, guys,
I am learning about the DeepLabV3+ model these days,
and I ran into a strange phenomenon: using the same batch size that works fine in training triggers “RuntimeError: CUDA out of memory.” during evaluation.
At the same time, inference seems to be quite a bit faster than training.

Any idea or answer will be appreciated!

Are you seeing the OOM error after a few iterations, or how were you able to observe that the validation step is faster?
Since Python uses function scoping, you might want to wrap the training and validation code in separate functions, so that the intermediate tensors are freed once each function returns, as explained here.
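For example, something along these lines (a minimal sketch only; `model`, `val_loader`, and `device` are placeholders for your actual objects, and I'm assuming the model returns per-pixel logits):

```python
import torch
import torch.nn.functional as F

def validate(model, val_loader, device):
    # All intermediate tensors created here are local to this function
    # and can be freed as soon as it returns.
    model.eval()
    total_loss = 0.0
    with torch.no_grad():  # don't store activations that are only needed for backprop
        for images, targets in val_loader:
            images, targets = images.to(device), targets.to(device)
            outputs = model(images)       # assumes per-pixel logits of shape [N, C, H, W]
            loss = F.cross_entropy(outputs, targets)
            total_loss += loss.item()     # .item() detaches the scalar from the graph
    return total_loss / len(val_loader)
```

Calling `val_loss = validate(model, val_loader, device)` from your training loop then lets all of those tensors go out of scope after each validation run.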


Hi, I saw this OOM error as soon as the program started.

What is the largest batch size you can run in training and evaluation without running into the OOM error?


The largest batch size is 4 in training and 2 in evaluation.

Could you post a code snippet to reproduce this issue?
Instead of your real data, you could initialize the input and target with random tensors, so that we can debug it.
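Something along these lines would already help (a rough sketch only: I'm using torchvision's `deeplabv3_resnet50` as a stand-in for your actual DeepLabV3+ implementation, and the input resolution and number of classes are guesses):

```python
import torch
import torchvision

device = torch.device("cuda")
model = torchvision.models.segmentation.deeplabv3_resnet50(num_classes=21).to(device)

# Random tensors instead of the real dataset
batch_size = 2  # the evaluation batch size that triggers the OOM
images = torch.randn(batch_size, 3, 513, 513, device=device)
targets = torch.randint(0, 21, (batch_size, 513, 513), device=device)

model.eval()
with torch.no_grad():
    output = model(images)["out"]  # torchvision segmentation models return a dict

print(output.shape)
print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.1f} MB")
```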


I think I should check the problem myself first.
Is there a way to show the GPU memory usage of a PyTorch model?

You could use torch.cuda.memory_allocated(), torch.cuda.memory_cached() etc. in your script to check the memory. Also, nvidia-smi will give you the overall memory usage (including the CUDA context).
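For example, a tiny helper like this could be dropped in before and after the training and validation steps (the name `print_gpu_memory` is just a suggestion):

```python
import torch

def print_gpu_memory(tag=""):
    # Memory currently occupied by tensors vs. memory reserved by the caching allocator
    allocated = torch.cuda.memory_allocated() / 1024**2
    reserved = torch.cuda.memory_reserved() / 1024**2  # torch.cuda.memory_cached() in older releases
    print(f"{tag}: allocated {allocated:.1f} MB, reserved {reserved:.1f} MB")

x = torch.randn(1024, 1024, device="cuda")
print_gpu_memory("after allocating a 1024x1024 tensor")
```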


Thanks for your answer.
I will try it next time.