I am trying to build an auto-encoder model, but when I add losses on intermediate layers, PyTorch raises the following AssertionError:
AssertionError: nn criterions don’t compute the gradient w.r.t. targets - please mark these variables as volatile or not requiring gradients.
My code is listed as follows:
import torch
import torch.nn as nn
from torch.autograd import Variable

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
fc1_r = nn.Linear(20, 50)
mse = nn.MSELoss()
x = Variable(torch.randn(100, 50))
l1 = fc1(x)
l2 = fc2(l1)
l1_r = fc2_r(l2)
x_r = fc1_r(l1_r)
loss0 = mse(x_r, x)   # fine: x does not require grad
loss1 = mse(l1_r, l1) # raises the AssertionError: l1 requires grad
The error means that the second argument to mse (the target in mse(output, target)) must be a Variable with requires_grad=False. The failing call is mse(l1_r, l1), because l1 is the output of fc1 and therefore requires grad. I'm not sure why you're computing that loss when l1_r and l1 are related to each other, but to get rid of the error you can try something like:
loss1 = mse(l1_r, Variable(l1.data, requires_grad=False))
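For reference, a runnable sketch of the same fix in current PyTorch, where Variable has been merged into Tensor and .detach() plays the role of Variable(l1.data, requires_grad=False) (this is an updated sketch, not the code from the original post):

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
fc1_r = nn.Linear(20, 50)
mse = nn.MSELoss()

x = torch.randn(100, 50)        # plain tensor: requires_grad is False
l1 = fc1(x)
l2 = fc2(l1)
l1_r = fc2_r(l2)
x_r = fc1_r(l1_r)

loss0 = mse(x_r, x)             # x is a constant target, so this is fine
loss1 = mse(l1_r, l1.detach())  # detach() marks the target as a constant
(loss0 + loss1).backward()
```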
You are missing non-linearities between layers! (Edit: I misunderstood what you were doing.)
I see, it works, thanks for your kind reply.
But there is another problem: in loss1, gradients with respect to both l1_r and l1 are needed to update the parameters. How can I handle that case? Maybe the following way (not very elegant):
loss1 = mse(l1_r, Variable(l1.data, requires_grad=False)) + mse(l1, Variable(l1_r.data, requires_grad=False))
Still, thanks for your attention!
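This symmetric workaround does send gradients into both branches, since each term treats one side as a constant target. A minimal sketch with present-day tensors (.detach() in place of Variable(..., requires_grad=False)):

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
mse = nn.MSELoss()

x = torch.randn(100, 50)
l1 = fc1(x)
l1_r = fc2_r(fc2(l1))

# each term holds one side fixed, so both branches receive gradients
loss1 = mse(l1_r, l1.detach()) + mse(l1, l1_r.detach())
loss1.backward()
```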
In that case I'd recommend that you write your own loss function. Something like
loss1 = ((l1_r - l1) * (l1_r - l1)).mean()
so that you can backprop through both l1_r and l1.
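A runnable sketch of this hand-written loss (again using present-day tensors rather than Variable): because no nn criterion is involved, there is no assertion on the target, and autograd propagates through both l1_r and l1 in a single backward pass.

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)

x = torch.randn(100, 50)
l1 = fc1(x)
l1_r = fc2_r(fc2(l1))

# hand-written MSE: gradients flow into both l1_r and l1
loss1 = ((l1_r - l1) * (l1_r - l1)).mean()
loss1.backward()
```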
Wow, it works! Thanks very much.