Memory increases from first to second forward pass during evaluation

When I run my evaluation loop while watching GPU memory with nvidia-smi, the first batch is processed using 7 GB of GPU memory. When the second batch is forwarded, usage grows to 14 GB and stays constant afterwards.

I have my model in evaluation mode.

test_loader = DataLoader(test_reader, batchsize, num_workers=0,
                         shuffle=False, pin_memory=True, drop_last=False)
for itr, image_batch in enumerate(test_loader):
    # Forward pass
    image_batch = image_batch.cuda()
    prediction_batch = model.forward(image_batch)

Any idea why this is so?
I assume the model doesn’t require gradients in evaluation mode. Or does the model still store gradients?

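You should run the forward pass inside the context manager with torch.no_grad().
model.eval() only changes how layers such as dropout and batch norm behave; it does not turn off autograd. Without no_grad(), the computation graph and the intermediate activations needed for a backward pass are still kept in memory.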
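
To make that concrete, here is a minimal sketch of the loop from the question wrapped in torch.no_grad(). The tiny convolutional model and the random TensorDataset are hypothetical stand-ins for the original model and test_reader, and the device falls back to CPU so the snippet runs anywhere. A plausible reading of the jump from 7 GB to 14 GB is that prediction_batch from the first iteration still holds its graph while the second forward pass allocates a new one; I have not verified that on the original model.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the model and test_reader from the question.
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU()).to(device)
test_reader = TensorDataset(torch.randn(8, 3, 16, 16))
test_loader = DataLoader(test_reader, batch_size=4, num_workers=0, shuffle=False,
                         pin_memory=torch.cuda.is_available(), drop_last=False)

model.eval()  # only switches layers like dropout/batch norm to inference behaviour

predictions = []
with torch.no_grad():  # autograd records no graph inside this block
    for itr, (image_batch,) in enumerate(test_loader):
        image_batch = image_batch.to(device)
        prediction_batch = model(image_batch)
        # Move results off the GPU so they do not accumulate there.
        predictions.append(prediction_batch.cpu())
```

Outputs produced inside the block have no grad_fn, so nothing from the forward pass is kept around for a backward pass.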