Hi, I am new to PyTorch and this is what I want to do:

I have two models, modelA and modelB, that are related to one another. I want to compute three separate gradients. I can get the first two without any issues, but I am not sure how to get the third one:

1. differentiating the loss of modelA wrt modelA.parameters

2. differentiating the loss of modelB wrt modelB.parameters

3. differentiating the loss of modelB wrt modelA.parameters
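The first two gradients can be sketched roughly like this, using `torch.autograd.grad`. The models, data, and loss here are hypothetical placeholders, assuming each model has its own scalar loss:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for modelA and modelB
modelA = nn.Linear(4, 2)
modelB = nn.Linear(4, 2)

x = torch.randn(5, 4)
y = torch.randn(5, 2)

# Scalar losses (mean over batch and output dimensions)
lossA = ((modelA(x) - y) ** 2).mean()
lossB = ((modelB(x) - y) ** 2).mean()

# Gradient 1: lossA wrt modelA's parameters
gradsA = torch.autograd.grad(lossA, modelA.parameters())
# Gradient 2: lossB wrt modelB's parameters
gradsB = torch.autograd.grad(lossB, modelB.parameters())
```

`torch.autograd.grad` returns a tuple of tensors, one per parameter, without populating `.grad` attributes the way `loss.backward()` does.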
You need to make sure that `lossB` is a scalar (i.e. a single number), so make sure you sum (or average) over your batch dimension as well. If you want per-sample gradients, this can be done with hooks, but it gets quite messy to implement.

Also, if `lossB` doesn't depend on the parameters of `modelA`, the gradients will be zero by definition. So, check that they depend on each other!
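As a minimal sketch of the cross-gradient case: if modelB consumes modelA's output, then `lossB` depends on modelA's parameters through the computation graph, and `torch.autograd.grad` can differentiate it wrt either model. The models and data below are hypothetical placeholders:

```python
import torch
import torch.nn as nn

# Assumed setup: modelB takes modelA's output as input,
# so lossB depends on modelA's parameters.
modelA = nn.Linear(4, 3)
modelB = nn.Linear(3, 1)

x = torch.randn(8, 4)
target = torch.randn(8, 1)

out = modelB(modelA(x))
lossB = ((out - target) ** 2).mean()   # scalar: averaged over the batch

# Gradient 2: lossB wrt modelB's parameters.
# retain_graph=True keeps the graph alive for the second grad call.
gradsB = torch.autograd.grad(lossB, modelB.parameters(), retain_graph=True)

# Gradient 3: lossB wrt modelA's parameters. This is nonzero only
# because lossB flows through modelA; if the models were disconnected,
# autograd would have no path and these gradients would not exist.
gradsA = torch.autograd.grad(lossB, modelA.parameters())
```

If the two models were not connected in the forward pass, the second `grad` call would raise an error (or return `None` with `allow_unused=True`), which is a quick way to check that the dependence actually exists.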