Out of memory without a hit on the GPU

jpainam · July 30, 2018, 9:07am

Hi!
I am monitoring my GPU and see no activity on it, yet I get an out of memory error.
I even set the batch size to 1. I’m feeding two images, of size (224,224) and (448,448).

Running on a Quadro K1200 GPU.

RuntimeError: cuda runtime error (2) : out of memory at c:\users\administrator\downloads\new-builder\win-wheel\pytorch\aten\src\thc\generic/THCStorage.cu:58

Is this normal behavior?
Thank you

ptrblck · July 30, 2018, 9:45am

What kind of model architecture are you using?
The Quadro K1200 has, if I recall correctly, 4GB of GPU RAM.
Could you try to check the memory usage with torch.cuda.memory_allocated()?
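
A minimal sketch of that check, assuming a placeholder Conv2d layer in place of the real model (the architecture was not posted) and a hypothetical helper named allocated_mib. It reads torch.cuda.memory_allocated() and torch.cuda.max_memory_allocated() before and after one forward pass, and falls back to zeros on a machine without CUDA:

```python
import torch

def allocated_mib(device_index=0):
    """Return (current, peak) memory occupied by tensors on one GPU, in MiB.

    Hypothetical helper for illustration; returns zeros when CUDA is absent.
    """
    if not torch.cuda.is_available():
        return 0.0, 0.0
    current = torch.cuda.memory_allocated(device_index) / 1024 ** 2
    peak = torch.cuda.max_memory_allocated(device_index) / 1024 ** 2
    return current, peak

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model: the real architecture is an unknown here.
model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1).to(device)
x = torch.randn(1, 3, 224, 224, device=device)

before = allocated_mib()
out = model(x)
after = allocated_mib()

print("allocated before forward: %.2f MiB" % before[0])
print("allocated after forward:  %.2f MiB (peak %.2f MiB)" % after)
```

These counters only cover memory held by PyTorch tensors, so the numbers will usually be lower than what nvidia-smi reports for the process.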

Deeply · July 30, 2018, 2:46pm

Another way to investigate this is to resize the images to 32x32 and 64x64, just for debugging, and see how much memory the reduced sizes consume. Whether or not that reveals the error, if the image size turns out to drive your memory usage, you can resize the images to some intermediate size (not necessarily 32x32 or 64x64) that lets your job fit on the GPU. You can also run 'watch -n 2 nvidia-smi' from the terminal to watch the GPU.
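
As a rough sketch of the resizing step, assuming the images are already float tensors in (batch, channels, height, width) layout and using a hypothetical helper named shrink built on torch.nn.functional.interpolate. The random tensors stand in for the two real inputs:

```python
import torch
import torch.nn.functional as F

def shrink(images, size):
    """Downscale a (N, C, H, W) float batch to `size` = (height, width)."""
    return F.interpolate(images, size=size, mode="bilinear", align_corners=False)

def size_mib(t):
    """Memory taken by the tensor's own data, in MiB."""
    return t.element_size() * t.nelement() / 1024 ** 2

# Stand-ins for the two inputs from the original post.
img_a = torch.randn(1, 3, 224, 224)
img_b = torch.randn(1, 3, 448, 448)

small_a = shrink(img_a, (32, 32))
small_b = shrink(img_b, (64, 64))

for name, t in [("224x224", img_a), ("448x448", img_b),
                ("32x32", small_a), ("64x64", small_b)]:
    print("%s input: %.4f MiB" % (name, size_mib(t)))
```

Going from 224x224 to 32x32 cuts the pixel count by a factor of 49, and the activations of convolutional layers generally shrink with it, so a model that fits at the small size but not at the large one points to input resolution as the cause.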

jpainam · August 1, 2018, 2:39am

Thanks guys, reducing the image size helped me understand that the error was due to the GPU memory size.

Besides, I moved to more powerful GPUs and want to use both of them (0 and 1), but I only end up running on one GPU.
Neither torch.cuda.set_device(0) and torch.cuda.set_device(1) nor os.environ["CUDA_VISIBLE_DEVICES"] = "0,1" helped.
What can I do to use both GPUs?

ptrblck · August 1, 2018, 9:18am

You could use nn.DataParallel to split your batch onto your GPUs.
Here is a good tutorial to get you started.
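
A minimal sketch of the nn.DataParallel approach, assuming a placeholder nn.Linear model and a random batch in place of the real ones. The wrapper is only applied when more than one GPU is visible, so the same script also runs on a single GPU or on the CPU:

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Placeholder model standing in for the real network.
model = nn.Linear(10, 2)

if torch.cuda.device_count() > 1:
    # By default DataParallel replicates the model on all visible GPUs
    # and splits each input batch along dimension 0.
    model = nn.DataParallel(model)

model = model.to(device)

# The batch must hold at least as many samples as there are GPUs,
# otherwise some GPUs receive nothing to work on.
batch = torch.randn(8, 10, device=device)
out = model(batch)  # outputs are gathered back on the first device

print("GPUs visible:", torch.cuda.device_count())
print("output shape:", tuple(out.shape))
```

Because the split happens along the batch dimension, a batch size of 1 would still keep only one GPU busy. Also note that CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA is initialized in the process.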