I have trained a UNet-style model on 512×512 images. Training and evaluation run without error.
However, I would now like to evaluate the model on a much larger image, roughly 15,000×40,000.
I have 24 GB of total GPU memory. After loading the model's state dictionary and sending the model to the GPU, the GPU memory usage is:

```python
torch.cuda.memory_reserved(0) / 1e9
>> 0.24
```
After sending the "big image" to the GPU, total GPU usage is ~7.5 GB, which is expected for a 3-channel float32 image of this size.
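As a sanity check, the raw tensor size works out as expected (plain arithmetic from the stated shape, nothing assumed beyond it):

```python
# 3-channel float32 image: H * W * C * 4 bytes
h, w, c = 15_000, 40_000, 3
print(h * w * c * 4 / 1e9)  # ~7.2 GB, matching the ~7.5 GB observed on the device
```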
Then I try to evaluate the model as follows:

```python
from torchvision import transforms

with torch.inference_mode():
    # ToTensor must be instantiated before it is applied;
    # [None] adds a batch dimension -> (1, 3, H, W)
    imt = transforms.ToTensor()(im)[None]
    imc = imt.to(device)
    out = model(imc)
```
This throws the following OOM error:

```
RuntimeError: CUDA out of memory. Tried to allocate 148.56 GiB (GPU 0; 23.68 GiB total capacity; 7.07 GiB already allocated; 14.36 GiB free; 7.08 GiB reserved in total by PyTorch)
```
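For scale: if the first UNet block were to produce, say, 64 feature maps at full resolution (an assumption on my part, not taken from the model definition), a single float32 activation tensor would already be in the range of the failed allocation:

```python
# Hypothetical first-layer activation: 64 channels at full resolution
# (the channel count is an assumption for illustration)
h, w, channels = 15_000, 40_000, 64
print(h * w * channels * 4 / 2**30)  # ~143 GiB, the same order as the 148.56 GiB request
```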
My question is: why is so much memory being allocated even within inference_mode?
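I'm aware I could fall back to tiled inference, something along these lines (a minimal sketch; the tile size is arbitrary, there is no overlap handling, and seams/padding are ignored):

```python
import torch

def tiled_forward(model, image, tile=512, device="cuda"):
    """Run the model tile-by-tile on the GPU, stitching outputs on the CPU (naive, no overlap)."""
    _, c, h, w = image.shape
    # Assumes the model's output has the same shape/channels as the input
    out = torch.zeros_like(image)
    with torch.inference_mode():
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                patch = image[:, :, y:y + tile, x:x + tile].to(device)
                out[:, :, y:y + tile, x:x + tile] = model(patch).cpu()
    return out
```

But I'd first like to understand why inference_mode doesn't prevent this memory blow-up.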