RuntimeError: CUDA error: out of memory in PyTorch

I’m getting "RuntimeError: CUDA error: out of memory" in PyTorch with a batch size over 4 on an NVIDIA P100 GPU with 16 GB of memory.

test_loader = DataLoader(dataset=test_dataset, batch_size=BATCH_SIZE,
                         shuffle=False, num_workers=8, pin_memory=True)

# initialize the ground truth and output tensor
gt = torch.FloatTensor()
gt = gt.cuda()
pred = torch.FloatTensor()
pred = pred.cuda()

# switch to evaluate mode
model.eval()

for i, (inp, target) in enumerate(test_loader):
    target = target.cuda()
    gt = torch.cat((gt, target), 0)
    bs, n_crops, c, h, w = inp.size()
    input_var = torch.autograd.Variable(inp.view(-1, c, h, w).cuda(), volatile=True)
    output = model(input_var)  # <=== error thrown here

I also tried opening the GIF files I am training on and closing them again, but that didn’t work either.

image =
image1 = image.convert('RGB')

Could you lower the batch size to 1 and check whether your code works?
Also, could you check the output of nvidia-smi to see how much memory your model uses and whether other processes are using the GPU?
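
If you want to log this from a script instead of reading the terminal, here is a minimal sketch. It assumes nvidia-smi is on your PATH and uses its CSV query mode; the helper names parse_gpu_memory and query_gpu_memory are made up for this example, and the sample string is illustrative, not output from your machine.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) into a list of (used, total) int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return one (used_mib, total_mib) tuple per GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


# Illustrative sample only (hypothetical numbers); on a GPU machine you
# would call query_gpu_memory() instead.
sample = "15079, 16280\n"
for index, (used, total) in enumerate(parse_gpu_memory(sample)):
    print(f"GPU {index}: {used} MiB used of {total} MiB")
```

If the used memory is already high before your script starts, another process is probably holding the GPU.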

If your memory footprint is really large, you might try to use checkpointing to trade compute for memory.
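
A rough sketch of what checkpointing could look like with torch.utils.checkpoint follows. The CheckpointedNet model, its layer sizes, and the input shape are placeholders I made up, not your model, and the use_reentrant argument assumes a reasonably recent PyTorch release.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedNet(nn.Module):
    """Hypothetical stand-in model; not the model from the question."""

    def __init__(self, num_classes=14):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        # Intermediate activations inside each checkpointed block are not
        # kept; they are recomputed during the backward pass.
        x = checkpoint(self.block1, x, use_reentrant=False)
        x = checkpoint(self.block2, x, use_reentrant=False)
        x = self.pool(x).flatten(1)
        return self.fc(x)


model = CheckpointedNet()
inp = torch.randn(2, 3, 32, 32)  # placeholder batch of 2 RGB images
out = model(inp)
out.sum().backward()  # the checkpointed blocks are recomputed here
```

Note that checkpointing only saves memory while autograd is recording the forward pass, i.e. during training. For a pure evaluation loop, wrapping the forward pass in torch.no_grad() is the usual way to avoid storing activations; as far as I know, volatile=True has had no effect since PyTorch 0.4.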

It works with a batch size of 3, but for my experiment I need a batch size of 8. I am processing X-ray images. Let me try checkpointing.