Hello,

I’m trying to get the gradient with respect to the input without accumulating gradients for the model’s parameters.

More specifically, there is an input A that goes into a model M, which produces an output P.

With this output P and the input A, I construct another input C (for example, by element-wise multiplication or something similar), feed it into the same model M, and get the final output Q.

Now, the loss function is computed from this output Q and the given label.

To make things clear, I will do

input A -> model M -> output P (process #1)

C (the result of doing something with output P and input A) -> model M -> output Q (process #2)

and then I will do

loss = criterion(output Q, label)

loss.backward()
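To make the setup concrete, here is a minimal sketch of the two passes. M, the combine step (P * A), and criterion are just placeholders for my actual model, combination, and loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

M = nn.Linear(4, 4)          # placeholder for model M
criterion = nn.MSELoss()     # placeholder loss

A = torch.randn(2, 4)        # input A
label = torch.randn(2, 4)

P = M(A)                     # process #1: A -> M -> P
C = P * A                    # placeholder combine step producing C
C.retain_grad()              # ask autograd to keep dloss/dC on the non-leaf C
Q = M(C)                     # process #2: C -> M -> Q (same model M)

loss = criterion(Q, label)
loss.backward()

# After backward(), C.grad holds dloss/dC as desired, but
# M.weight.grad / M.bias.grad mix contributions from BOTH
# process #1 and process #2 - which is what I want to avoid.
```

With a plain backward() like this, the parameter gradients from the two passes are summed together; the question is how to keep only the process #1 contribution while still getting the gradient at C.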

but the point is:

- the gradient of the loss w.r.t. model M’s parameters in process #2 should NOT be accumulated into model.something.weight.grad and model.something.bias.grad,

- while the gradient of the loss w.r.t. C in process #2 SHOULD be computed and saved in C.grad.

Backpropagation should then flow through C back into process #1.

Here in process #1, on the other hand,

- the gradient of the loss w.r.t. model M’s parameters SHOULD be computed and saved as usual.

As far as I know, .detach() blocks gradients from flowing to everything upstream of the detach point (everything closer to the input), so it would prevent the gradient of C, and of process #1’s parameters, from being computed at all.
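To illustrate why a plain .detach() on C does not do what I want: it cuts the graph back to process #1 entirely, while the parameters still receive gradients from process #2 (again, M, the combine step, and criterion are placeholders):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

M = nn.Linear(4, 4)                        # placeholder for model M
criterion = nn.MSELoss()                   # placeholder loss

A = torch.randn(2, 4, requires_grad=True)  # input A, tracking gradients
label = torch.randn(2, 4)

P = M(A)              # process #1
C = P * A             # placeholder combine step
C_det = C.detach()    # cut the graph here: C_det requires no grad

Q = M(C_det)          # process #2
loss = criterion(Q, label)
loss.backward()

# A.grad is None: nothing flows back through C into process #1.
# M.weight.grad is populated: the parameters still get gradients
# from process #2 - the opposite of the behavior I'm after.
```

So detaching gives exactly the reverse of the split I described above.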

Is there any way to do this?