Integrated Gradients for RNNs


I’m trying to implement Integrated Gradients (explainability method) for my seq2seq NMT model since there are no public implementations available. For that, I would be required to compute the gradient of my output w.r.t. my input, which would in turn require me to set requires_grad = True for my input variables. Consequently, I would need my input to be of dtype = torch.float (or else an error is thrown), but then I would not be able to use the inputs as inputs to the encoder/decoder embedding layers since those require a dtype = torch.long input. Any suggestions on how to go about this?



You won’t be able to get gradients w.r.t. the input of the embedding layer, I’m afraid. As you pointed out, those inputs are discrete integer indices (torch.long), and gradients are only defined for floating-point tensors.
You might want to apply the technique to the output of the embedding layer instead — interpolate between a baseline embedding and the actual embedding, and attribute w.r.t. those float vectors.
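To make that concrete, here is a minimal sketch of Integrated Gradients computed on the embedding output rather than the token indices. The toy model (`embedding`, `classifier`, `forward_from_embeddings`), the zero-embedding baseline, and all sizes are assumptions for illustration, not your seq2seq model — but the same pattern applies to an encoder/decoder if you split the forward pass after the embedding lookup.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: names, sizes, and the zero baseline are assumptions.
torch.manual_seed(0)
vocab_size, embed_dim, num_classes = 100, 16, 5
embedding = nn.Embedding(vocab_size, embed_dim)
classifier = nn.Linear(embed_dim, num_classes)

def forward_from_embeddings(embeds):
    # The part of the model AFTER the embedding lookup:
    # mean-pool over the sequence, then classify.
    return classifier(embeds.mean(dim=1))

token_ids = torch.tensor([[3, 17, 42]])      # dtype=torch.long, as the embedding requires
target_class = 2
steps = 50

with torch.no_grad():
    input_embeds = embedding(token_ids)      # (1, seq_len, embed_dim), dtype=torch.float
baseline = torch.zeros_like(input_embeds)    # zero-embedding baseline (a common choice)

# Riemann-sum approximation of the IG path integral.
total_grads = torch.zeros_like(input_embeds)
for alpha in torch.linspace(0.0, 1.0, steps):
    interp = baseline + alpha * (input_embeds - baseline)
    interp.requires_grad_(True)              # float tensor, so gradients are fine here
    score = forward_from_embeddings(interp)[0, target_class]
    grad, = torch.autograd.grad(score, interp)
    total_grads += grad

# IG = (input - baseline) * average gradient along the path.
ig = (input_embeds - baseline) * total_grads / steps
token_attributions = ig.sum(dim=-1)          # collapse embedding dim -> per-token scores
print(token_attributions.shape)              # torch.Size([1, 3])
```

A useful sanity check is the completeness axiom: the attributions should sum (approximately) to `f(input) - f(baseline)` for the target class.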

Thanks for your reply, you are right!