Consider a network (i) -> (h) -> (o), where i, h, and o are the input, hidden, and output layers, respectively.
I would like to associate a loss Lh with layer h and a loss Lo with layer o. However, I wish to backpropagate Lh only from layer h backward, and Lo only from layer o backward.
Could anyone please point me in the right direction to do so?
Many thanks
Here is a small example:
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 10)
        self.act = nn.ReLU()

    def forward(self, x):
        x1 = self.act(self.fc1(x))  # hidden activation h
        x = self.fc2(x1)            # output o
        return x, x1                # return both so each can receive its own loss
# Create the model and execute a forward pass
criterion = nn.MSELoss()
model = MyModel()
x = torch.randn(1, 10)
o, h = model(x)

# Calculate one loss per layer output
loss_o = criterion(o, torch.rand_like(o))
loss_h = criterion(h, torch.rand_like(h))

# Backward loss_h and keep the intermediate activations,
# since loss_o.backward() will reuse part of the graph
loss_h.backward(retain_graph=True)

# Check that the self.fc2 grads are still empty (None)
for name, param in model.named_parameters():
    print(name, param.grad)

# Backward loss_o
loss_o.backward()

# Gradients are accumulated in self.fc1 and newly populated in self.fc2
grads2 = []
for name, param in model.named_parameters():
    print(name, param.grad)
    grads2.append(param.grad.clone())
You would also get the same gradients if you sum both losses and call .backward() on the resulting tensor.
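To illustrate that, here is a minimal sketch of the summed-loss variant. It assumes the model, criterion, and x defined above; the targets are drawn again at random, so the printed values will differ from grads2, but the gradient flow is the same: self.fc1 receives contributions from both losses, while self.fc2 is only reached by loss_o.

model.zero_grad()  # reset the gradients accumulated by the previous backward calls

# Forward pass again and build a single combined loss
o, h = model(x)
loss = criterion(o, torch.rand_like(o)) + criterion(h, torch.rand_like(h))

# One backward call; autograd routes each term from its own layer backward
loss.backward()

for name, param in model.named_parameters():
    print(name, param.grad)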