Hi,
I am wondering how I can do the following efficiently.
I have one model M that takes about 7 GB of GPU memory, and I have 20 different sets of parameters for that model. For the same batch of input, how can I get the gradients w.r.t. that input batch under each of the 20 parameter sets?
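
To make the question concrete, here is a minimal sketch of the naive approach I have in mind. It assumes PyTorch (an assumption, any autograd framework would look similar), uses a toy `nn.Linear` as a stand-in for M, and sums the output in place of a real loss. The function name `input_grads_per_param_set` is only for illustration.

```python
import torch
import torch.nn as nn


def input_grads_per_param_set(model, state_dicts, x, device="cpu"):
    """Naive loop: return one input-gradient tensor per parameter set."""
    model = model.to(device)
    grads = []
    for sd in state_dicts:
        # Copy this parameter set into the single model instance on the device.
        model.load_state_dict(sd)
        # Fresh leaf tensor so the gradient is taken w.r.t. the input batch.
        inp = x.detach().clone().to(device).requires_grad_(True)
        out = model(inp).sum()  # scalar stand-in for the real loss (assumption)
        (g,) = torch.autograd.grad(out, inp)
        grads.append(g.cpu())
    return grads


# Toy stand-in for M and its 20 parameter sets (hypothetical data).
torch.manual_seed(0)
toy = nn.Linear(4, 1)
param_sets = [
    {k: torch.randn_like(v) for k, v in toy.state_dict().items()}
    for _ in range(20)
]
batch = torch.randn(8, 4)

grads = input_grads_per_param_set(toy, param_sets, batch)
```

This keeps only one copy of the model on the device at a time, but it reloads the weights 20 times and runs 20 forward/backward passes one after another, which is the part I would like to make more efficient.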