GPU memory that model uses

Thanks, it was:

import torch

# Peak GPU memory occupied by tensors, in bytes, since the start of the
# program (or since the last call to torch.cuda.reset_peak_memory_stats())
torch.cuda.max_memory_allocated()

Hopefully this can help me figure out the maximum batch size I can use with a model, though I wonder if PyTorch already provides something similar built in.

However, I am not sure whether this also counts memory that is still held but could be freed after gc.collect().

Maybe this is what PyTorch calls the cache.
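For what it's worth, PyTorch distinguishes memory *allocated* to live tensors from memory the caching allocator has *reserved* from the driver; the latter is the "cache". A minimal sketch of the difference (guarded so it only runs on a machine with a GPU; the tensor shape is just an arbitrary example):

```python
import torch

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()

    x = torch.randn(1024, 1024, device="cuda")  # roughly 4 MB of float32
    del x  # memory returns to PyTorch's caching allocator, not the driver

    # Allocated memory drops once the tensor is gone...
    print("allocated:", torch.cuda.memory_allocated())
    # ...but the allocator keeps the blocks cached for reuse:
    print("reserved: ", torch.cuda.memory_reserved())
    # The peak still reflects the high-water mark:
    print("peak:     ", torch.cuda.max_memory_allocated())

    # torch.cuda.empty_cache() would release the cached blocks back
    # to the driver (useful when sharing the GPU, not for speed).
```

So `max_memory_allocated()` tracks tensor usage itself, while `max_memory_reserved()` would show the cached total; nvidia-smi reports the reserved figure plus CUDA context overhead.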