When I calculate integrated gradients, I run out of GPU memory. It seems to be related to autograd, but I am not sure how to compute several integrated gradients with different numbers of steps without running out of memory. Wrapping the IG calculation in with torch.no_grad(): makes no difference in behaviour.
I am using an A100 16 GB GPU.
Would you care to elaborate? The attribution method takes the model, an input test example, and a target. How does batch size come into play, and why am I running out of memory at such a low level?
Hi,
Integrated gradients computes gradients at multiple points interpolated between the baseline and the input (the number of points corresponds to the n_steps parameter). The default is 50 steps.
You might be running out of memory because a batch of 50 points (if you have not changed the n_steps parameter) is too large for your GPU memory.
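For intuition, here is a minimal sketch (plain PyTorch, not Captum's internals; the input shape is a hypothetical example) of why memory scales with n_steps: all interpolated points between the baseline and the input are materialized as one batch, and gradients are computed for that whole batch at once.

```python
import torch

n_steps = 50
inp = torch.randn(3, 224, 224)        # hypothetical input shape
baseline = torch.zeros_like(inp)      # all-zeros baseline

# One interpolation coefficient alpha in [0, 1] per step
alphas = torch.linspace(0, 1, n_steps).view(-1, 1, 1, 1)

# All n_steps interpolated points form a single batch, so activation
# memory during the forward/backward pass grows linearly with n_steps.
points = baseline + alphas * (inp - baseline)  # shape: (50, 3, 224, 224)
```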
If so, you can either reduce the number of steps or use the internal_batch_size parameter, as in the sketch below.
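A minimal sketch of both options using Captum's IntegratedGradients (the model here is a hypothetical stand-in; substitute your own network, input, and target):

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Hypothetical stand-in model; replace with your own
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10)).eval()
ig = IntegratedGradients(model)

inp = torch.randn(1, 3, 224, 224)
baseline = torch.zeros_like(inp)

# Option 1: fewer interpolation steps (coarser approximation of the integral)
attr = ig.attribute(inp, baselines=baseline, target=0, n_steps=20)

# Option 2: keep 50 steps but evaluate them in chunks of 10, so at most
# 10 interpolated points are on the GPU at any one time (slower, less memory)
attr = ig.attribute(inp, baselines=baseline, target=0,
                    n_steps=50, internal_batch_size=10)
```

With internal_batch_size, the accuracy of the attribution is unchanged; you only trade some speed for a lower peak memory footprint.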