Using loss terms with output from intermediate layers

I am trying to use loss terms on the output of intermediate layers, but I get an error saying that gradients cannot be computed with respect to labels. To be more explicit:

Say you have an architecture like

layer1 = self.layer1(input)
layer1 = F.relu(layer1)

layer2 = self.layer2(layer1)
layer2 = F.relu(layer2)

layer3 = self.layer3(layer2)
layer3 = F.relu(layer3)

And I want to use a loss term like

criterion = nn.MSELoss()
loss_term = criterion(layer2, layer1) 

And I get the error mentioned above: “cannot compute gradients with respect to labels. Either mention requires_gradients = False or set the variable as volatile”. (The error message is approximate, since I don’t have PyTorch and my code at hand to reproduce it quickly.) I need to implement something like the above snippet; can anyone please help?
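
For completeness, here is a minimal self-contained sketch of the setup (the nn.Linear layers and their sizes are made up for illustration); on older, Variable-era PyTorch versions the last line raises the error above:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # hypothetical layer sizes, just for illustration
        self.layer1 = nn.Linear(10, 10)
        self.layer2 = nn.Linear(10, 10)
        self.layer3 = nn.Linear(10, 10)

    def forward(self, input):
        layer1 = F.relu(self.layer1(input))
        layer2 = F.relu(self.layer2(layer1))
        layer3 = F.relu(self.layer3(layer2))
        # expose the intermediate activations so a loss term can use them
        return layer1, layer2, layer3

net = Net()
layer1, layer2, layer3 = net(torch.randn(4, 10))
criterion = nn.MSELoss()
loss_term = criterion(layer2, layer1)  # raises the "gradients w.r.t. labels" error on old PyTorch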

nn.MSELoss expects the second input to be the “target”, and targets are assumed not to require gradients. In your case layer1 is a Variable that requires its gradients to be computed (according to your snippet), hence the error.

You can compute your loss like this:

loss = torch.pow(layer2 - layer1, 2).mean()
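
Note that this manual expression backpropagates through both layer1 and layer2. If instead you want layer1 to act as a fixed target, with gradients flowing only through layer2, an alternative (assuming that is the behaviour you want) is to detach it:

loss = criterion(layer2, layer1.detach())

This keeps nn.MSELoss but removes layer1 from the graph for this loss term, so the error goes away.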

Is it possible to implement cross-entropy like this (an equivalent of BCELoss)? I know it’s a sum over p * log(q), but I am not sure how to implement it myself. If it works, this could be a solution to my problem (I need both an MSE and a BCE loss). Thanks for your answer.
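
In the same spirit as the manual MSE above, binary cross-entropy can be written out by hand. A minimal sketch, assuming the target p (here layer1) and the prediction q (here layer2) are first squashed into (0, 1), e.g. with torch.sigmoid, and with a small eps as a guard against log(0):

eps = 1e-7
p = torch.sigmoid(layer1)                      # assumption: map activations into (0, 1)
q = torch.sigmoid(layer2).clamp(eps, 1 - eps)  # clamp to avoid log(0)
bce = -(p * torch.log(q) + (1 - p) * torch.log(1 - q)).mean()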