Gradient memory issue


When I calculate integrated gradients, I run out of GPU memory. It seems to be related to autograd, but I am not sure how to calculate several integrated gradients with different numbers of steps without running out of memory. I also tried running the IG calculation inside a with torch.no_grad(): block, but the behaviour is the same.
I am using an A100 16G GPU.

Hi!
You can consider using the internal_batch_size parameter when you call the attribute method.

Could you elaborate? The attribution method takes the model, an input test example, and a target. How does batch size come into play, and why am I running out of memory at a low level?

Integrated gradients computes gradients at multiple points between the baseline and the input (the number of points corresponds to the n_steps parameter). The default is 50 steps.
You might run out of memory because a batch of 50 points per input example (if you have not changed the n_steps parameter) is too much for your GPU memory.
If so, you can either reduce the number of steps or use the internal_batch_size parameter, which splits those points into smaller batches that are processed one after another.
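
To illustrate why the step count drives memory use, here is a rough sketch of integrated gradients written directly in PyTorch. It is not Captum's implementation: it uses a plain midpoint Riemann sum, and the function name integrated_gradients, the chunk_size argument, and the toy model are all hypothetical. The interpolated points are pushed through the model in chunks, so only chunk_size points are held in the autograd graph at once.

```python
import torch
import torch.nn as nn

def integrated_gradients(model, x, baseline, target, n_steps=50, chunk_size=10):
    """Midpoint Riemann-sum approximation of integrated gradients for one example.

    x and baseline are 1-D tensors of the same shape. Only chunk_size
    interpolated points go through the model per forward/backward pass.
    """
    # Midpoints of n_steps equal sub-intervals of [0, 1].
    alphas = (torch.arange(n_steps, dtype=x.dtype) + 0.5) / n_steps
    grad_sum = torch.zeros_like(x)
    for start in range(0, n_steps, chunk_size):
        a = alphas[start:start + chunk_size].view(-1, 1)
        # Points on the straight line from the baseline to the input.
        points = (baseline + a * (x - baseline)).detach().requires_grad_(True)
        out = model(points)[:, target].sum()
        (grads,) = torch.autograd.grad(out, points)
        grad_sum += grads.sum(dim=0)  # the graph for this chunk is freed here
    return (x - baseline) * grad_sum / n_steps

torch.manual_seed(0)
# Hypothetical toy model, used only to exercise the function.
model = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 3)).eval()
x = torch.randn(8)
baseline = torch.zeros(8)

full = integrated_gradients(model, x, baseline, target=0, n_steps=50, chunk_size=50)
chunked = integrated_gradients(model, x, baseline, target=0, n_steps=50, chunk_size=5)
print(torch.allclose(full, chunked, atol=1e-5))
```

Both calls evaluate the same 50 points, so the attributions should agree up to floating-point error; the chunked call simply keeps fewer activations alive at any one time, which is the same trade-off internal_batch_size offers.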