I have a network where I want to backprop the losses **only to the initial layers**, and I do not want to update the later layers nearer to where the losses are computed. Currently I'm doing this by calling `requires_grad_(False)` on those layers' parameters, but this completely stops the gradient computation.
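For reference, here is a minimal sketch of what I'm doing now (the model and the early/later split are made up for illustration; my real network is bigger):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for my network: the first layer is the "early"
# part I want to train, the rest are the "later" layers near the loss.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.Linear(16, 16),
    nn.Linear(16, 1),
)

# My current approach: freeze every parameter of the later layers.
for layer in list(model)[1:]:
    for p in layer.parameters():
        p.requires_grad_(False)

print(model[0].weight.requires_grad)  # True
print(model[1].weight.requires_grad)  # False
```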

So if an earlier layer depends on a later layer for its gradients, it can no longer get them. I'm hoping to decouple the gradient computation from the gradient update, but it seems that when I call `backward()` on the losses, the autograd package does the computation and the update together. Is there a way for me to compute the gradients but **not** apply the updates?

I'm currently thinking of setting the LR to 0 for those layers/tensors and non-zero for the others. Is there a more straightforward way?
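A minimal sketch of the LR-0 idea using per-parameter-group learning rates (again with a made-up two-layer model): gradients are computed for everything, but the `lr=0.0` group's weights never move.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 1))

# One optimizer, two parameter groups: a real LR for the early layer,
# LR 0 for the later one so its gradients are computed but never applied.
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 0.1},
        {"params": model[1].parameters(), "lr": 0.0},
    ],
    lr=0.1,  # default LR for any group that doesn't specify one
)

x = torch.randn(4, 8)
loss = model(x).sum()
loss.backward()

later_before = model[1].weight.clone()
optimizer.step()

# The later layer has gradients, but its weights are unchanged.
print(model[1].weight.grad is not None)            # True
print(torch.equal(model[1].weight, later_before))  # True
```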

If I were to set `requires_grad_` to true for a **tensor**, would I need to zero its gradient after each update to prevent gradient accumulation? Currently there seems to be no built-in support for this.
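To make the accumulation concern concrete, here is a small sketch with a bare leaf tensor (not managed by any optimizer), where I clear the gradient by hand between backward passes:

```python
import torch

# Hypothetical standalone tensor whose gradient I track myself.
t = torch.randn(3, requires_grad=True)

(t * 2).sum().backward()
print(t.grad)  # tensor([2., 2., 2.])

# Without clearing, a second backward() would accumulate into t.grad,
# so I zero it manually between steps...
t.grad.zero_()
# ...or drop it entirely with: t.grad = None

(t * 3).sum().backward()
print(t.grad)  # tensor([3., 3., 3.])
```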