Suppose we have a composite loss such as loss = loss1 + loss2. We need the gradients of loss1 to flow through the whole network, but the gradients of loss2 to reach only the last few layers. In other words, how can we stop the gradient of loss2 at the first few layers while still letting loss1 update them?
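
To make the setup concrete, here is a minimal sketch of one possible approach. It assumes PyTorch, which the question does not name, and it uses a hypothetical two-part model: `early` stands in for the first few layers and `late` for the last few. The layer sizes, the targets, and the MSE losses are placeholders. The idea is to feed loss2 a detached copy of the intermediate activation, so its gradient cannot travel back past that point.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical split of the network (names and sizes are assumptions).
early = nn.Linear(4, 8)   # "first few layers"
late = nn.Linear(8, 2)    # "last few layers"

# Placeholder data and targets, for illustration only.
x = torch.randn(16, 4)
target1 = torch.randn(16, 2)
target2 = torch.randn(16, 2)

# Shared intermediate activation.
h = torch.relu(early(x))

# loss1 uses the normal graph, so its gradient reaches early and late.
loss1 = nn.functional.mse_loss(late(h), target1)

# loss2 uses a detached copy of h, so its gradient stops at late.
loss2 = nn.functional.mse_loss(late(h.detach()), target2)

loss = loss1 + loss2
loss.backward()

# early.weight.grad now comes from loss1 only;
# late.weight.grad is the sum of contributions from loss1 and loss2.
```

The cost of this sketch is a second forward pass through the last layers. If both losses are computed from the same output, that output would have to be produced twice, once from `h` and once from `h.detach()`, as done here.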