Thanks, it was:
import torch
# Peak GPU memory ever held by tensors since the start of the program, in bytes
torch.cuda.max_memory_allocated()
This can hopefully help me figure out the maximum batch size I can use with a model, though I wonder whether PyTorch already ships something like that out of the box.
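Here is roughly how I am planning to use it; the toy model, input shape, and batch sizes below are just placeholders, not anything built into PyTorch:

import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()

def peak_bytes_for_batch(batch_size):
    # Clear the running peak so each measurement starts fresh
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(batch_size, 1024, device="cuda")
    model(x).sum().backward()
    model.zero_grad(set_to_none=True)
    # Peak bytes held in tensors since the reset above
    return torch.cuda.max_memory_allocated()

for bs in (32, 64, 128):
    print(bs, peak_bytes_for_batch(bs) / 2**20, "MiB")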
However, I am not sure whether this will also count memory that the garbage collector could free after gc.collect(). Maybe that is what is called the cache.
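To check, I tried comparing allocated memory against the caching allocator's reserved memory; this snippet just reflects my understanding:

import gc
import torch

x = torch.randn(1024, 1024, device="cuda")
print(torch.cuda.memory_allocated())  # bytes held by live tensors
print(torch.cuda.memory_reserved())   # bytes held by the caching allocator

del x
gc.collect()  # drop any lingering Python references
print(torch.cuda.memory_allocated())  # back to ~0: the tensor is gone
print(torch.cuda.memory_reserved())   # still nonzero: blocks stay cached

torch.cuda.empty_cache()  # hand cached blocks back to the driver
print(torch.cuda.memory_reserved())   # ~0 again, if nothing else is allocated

If I read the docs right, max_memory_allocated() tracks only the tensor peak, while memory_reserved() / max_memory_reserved() report what the caching allocator is holding on to.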