GPU Memory in Eval vs Training

Hi there, I’m new to PyTorch and struggling to understand GPU memory management. I have a model which, during training, takes up slightly more memory than my GPU can handle, so I’ve gone ahead and trained it on an AWS instance with a larger GPU.

I’m wondering whether I will need the same amount of memory to evaluate the model on the GPU, or whether I could evaluate it without running into issues. I know that in both cases the whole model has to be loaded onto the GPU, but during evaluation you don’t need to calculate gradients, so I’d guess it takes less memory - I’m just not sure by how much. What do you think?

Your assumption is correct. During evaluation you don’t need to calculate the gradients, which take memory, and you also don’t need to store the intermediate activations that would otherwise be kept for the backward pass.

To lower the memory usage and avoid storing these intermediates, you should wrap your evaluation code in a with torch.no_grad() block as seen here:

import torch

model = MyModel().to('cuda')         # MyModel/data stand in for your own model and inputs
model.eval()                         # also switch layers such as dropout/batchnorm to eval mode
with torch.no_grad():                # disables gradient tracking inside the block
    output = model(data.to('cuda'))
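
If you want to see how large the difference is for your model, you could compare the peak memory of a forward/backward pass against an eval forward pass under no_grad using torch.cuda.max_memory_allocated(). Here is a rough sketch; MyModel and the dummy input shape (8, 3, 224, 224) are placeholders for your own model and batch:

import torch

model = MyModel().to('cuda')                       # placeholder for your model
data = torch.randn(8, 3, 224, 224, device='cuda')  # dummy batch; adjust to your input shape

# peak memory of a training-style forward + backward pass
torch.cuda.reset_peak_memory_stats()
out = model(data)
out.sum().backward()                               # dummy "loss" just to trigger the backward pass
print(f"train peak: {torch.cuda.max_memory_allocated() / 1024**2:.0f} MB")

# peak memory of an eval forward pass without gradient tracking
model.zero_grad(set_to_none=True)                  # free the gradients from the step above
model.eval()
torch.cuda.reset_peak_memory_stats()
with torch.no_grad():
    out = model(data)
print(f"eval peak:  {torch.cuda.max_memory_allocated() / 1024**2:.0f} MB")

Note that a real training step would also hold optimizer state (e.g. momentum buffers), so the gap in practice can be even larger than this comparison shows.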