When I calculate integrated gradients, I run out of GPU memory. It seems to be related to autograd, but I am not sure how to compute several integrated gradients with different numbers of steps without running out of memory. Wrapping the IG calculation in with torch.no_grad(): makes no difference in behaviour.
I am using an A100 16 GB GPU.
Would you care to elaborate? The attribution method takes the model, an input test example, and a target. How does batch size come into play, and why am I running out of memory at such a low level?
Hi,
Integrated gradients computes gradients at multiple points interpolated between the baseline and the input (the number of points corresponds to the n_steps parameter). The default is 50 steps.
You might be running out of memory because a batch of 50 points (if you have not changed the n_steps parameter) is too large for your GPU memory.
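For intuition, here is a minimal sketch (plain PyTorch, not Captum's internals; the input shape is a hypothetical example) of why memory scales with n_steps: all interpolated points between the baseline and the input are materialized as one batch, and gradients are computed for that whole batch at once.

```python
import torch

n_steps = 50
inp = torch.randn(3, 224, 224)        # hypothetical input shape
baseline = torch.zeros_like(inp)      # all-zeros baseline

# One interpolation coefficient alpha in [0, 1] per step
alphas = torch.linspace(0, 1, n_steps).view(-1, 1, 1, 1)

# All n_steps interpolated points form a single batch, so activation
# memory during the forward/backward pass grows linearly with n_steps.
points = baseline + alphas * (inp - baseline)  # shape: (50, 3, 224, 224)
```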
If so, you can either reduce the number of steps or use the internal_batch_size parameter, as in the sketch below.
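A minimal sketch of both options using Captum's IntegratedGradients (the model here is a hypothetical stand-in; substitute your own network, input, and target):

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Hypothetical stand-in model; replace with your own
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10)).eval()
ig = IntegratedGradients(model)

inp = torch.randn(1, 3, 224, 224)
baseline = torch.zeros_like(inp)

# Option 1: fewer interpolation steps (coarser approximation of the integral)
attr = ig.attribute(inp, baselines=baseline, target=0, n_steps=20)

# Option 2: keep 50 steps but evaluate them in chunks of 10, so at most
# 10 interpolated points are on the GPU at any one time (slower, less memory)
attr = ig.attribute(inp, baselines=baseline, target=0,
                    n_steps=50, internal_batch_size=10)
```

With internal_batch_size, the accuracy of the attribution is unchanged; you only trade some speed for a lower peak memory footprint.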