Inceptionv3 out of GPU memory error, even with memory management

import torch

inceptionv3.to(device)  # device = 'cuda'
inceptionv3.eval()

# attempt to disable gradient tracking for the whole model
inceptionv3.requires_grad_ = False

output = []

for batch in loader:
    torch.cuda.empty_cache()
    out = inceptionv3(batch[0].to(device))
    output.append(out.detach().cpu().numpy())
    del out

Somehow, this still runs out of GPU memory, and adjusting the batch size makes no difference.

Let me add that I’m using the facenet modification of Inception v3, so the problem might be internal to the way the model was modified for the facenet data.

inceptionv3 itself doesn’t have a requires_grad attribute, which is an attribute of tensors (including the model’s parameters).
Assigning False to inceptionv3.requires_grad_ just overwrites the module’s in-place requires_grad_() method with a new attribute and has no effect on the parameters.
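As a minimal sketch, assuming inceptionv3 is the nn.Module from your code above, freezing the model would look like this:

# Call the method instead of overwriting it: this flips
# requires_grad to False on every parameter in-place.
inceptionv3.requires_grad_(False)

# Equivalent explicit version:
for param in inceptionv3.parameters():
    param.requires_grad = False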

If you want to run inference only, you should wrap the code in a with torch.no_grad() block:

with torch.no_grad():
    out = inceptionv3(batch)

This will save some memory by not storing the intermediate activations, which would otherwise be kept around for the backward pass.
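Applied to your loop, a sketch (reusing the loader, device, and model from your post) could look like this:

output = []

with torch.no_grad():  # no autograd graph is built, so activations are freed right away
    for batch in loader:
        out = inceptionv3(batch[0].to(device))
        output.append(out.cpu().numpy())  # detach() is unnecessary under no_grad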

Calling torch.cuda.empty_cache() shouldn’t give you more usable memory, as it only releases cached blocks that PyTorch would otherwise reuse for new allocations; calling it inside the loop will just slow things down.
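If you want to see the difference between live and cached memory, torch.cuda.memory_allocated() and torch.cuda.memory_reserved() report both; a quick sanity check could be:

import torch

x = torch.randn(1024, 1024, device='cuda')
print(torch.cuda.memory_allocated())  # bytes held by live tensors
print(torch.cuda.memory_reserved())   # bytes held by the caching allocator

del x
print(torch.cuda.memory_allocated())  # drops back towards 0
torch.cuda.empty_cache()              # hands the cached blocks back to the driver
print(torch.cuda.memory_reserved())   # now drops as well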

Thank you. Do you know of a tutorial where these sorts of memory management details are covered? I haven’t seen anything very advanced or useful about this in the PyTorch tutorials.

I’m not sure if we have a tutorial specifically targeting “memory management”.
What would you like to see in such a tutorial from a user’s perspective?

Just common pitfalls: explanations of which parts of a model (tensors, parameters, optimizer and scheduler state, and so on) get loaded onto the GPU and can thus accumulate, how to avoid that accumulation, etc.
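For example, one pitfall of that kind (a hypothetical training snippet, not code from this thread) is accumulating a loss tensor that still references its autograd graph:

# Pitfall: total_loss keeps every iteration's graph alive,
# so GPU memory grows with each batch.
total_loss = 0.0
for data, target in loader:
    loss = criterion(model(data), target)
    total_loss += loss         # tensor accumulation retains the graph

# Fix: convert to a Python number before accumulating.
total_loss = 0.0
for data, target in loader:
    loss = criterion(model(data), target)
    total_loss += loss.item()  # frees the graph each iteration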