Bug of multiple losses on intermediate layers

(Xu Shen) #1

I am trying to build an auto-encoder model, but when I add losses on intermediate layers, PyTorch raises the following AssertionError:
AssertionError: nn criterions don’t compute the gradient w.r.t. targets - please mark these variables as volatile or not requiring gradients.

My code is listed as follows:
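import torch
import torch.nn as nn
from torch.autograd import Variable

# Encoder layers and their mirrored decoder layers
fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
fc1_r = nn.Linear(20, 50)
mse = nn.MSELoss()

x = Variable(torch.randn(100, 50))
l1 = fc1(x)
l2 = fc2(l1)
l1_r = fc2_r(l2)
x_r = fc1_r(l1_r)
loss0 = mse(x_r, x)    # reconstruction loss on the input
loss1 = mse(l1_r, l1)  # loss on the intermediate layer; l1 requires grad, so the assertion fires here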

Any suggestions?


The error means that the target, i.e. the second argument in mse(output, target), should be a Variable with requires_grad = False. I’m not sure why you’re computing mse(x_r, x) when x_r and x are related to each other, but to get rid of the error you can try something like:
mse(x_r, Variable(x.data, requires_grad = False))
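For anyone reading this on a newer PyTorch release, where tensors track gradients directly and Variable is no longer needed, detach() plays the same role as Variable(x.data, requires_grad=False). Here is a minimal sketch of that; the names output, target and fixed_target are just stand-ins I made up, not anything from the code above:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
mse = nn.MSELoss()

# Two leaf tensors that both track gradients
# (hypothetical stand-ins for a prediction and a target).
output = torch.randn(4, 3, requires_grad=True)
target = torch.randn(4, 3, requires_grad=True)

# detach() returns a tensor that shares the same data
# but is cut off from the autograd graph.
fixed_target = target.detach()

loss = mse(output, fixed_target)
loss.backward()

print(fixed_target.requires_grad)  # False
print(output.grad is not None)     # True: the gradient reaches the prediction
print(target.grad is None)         # True: nothing flows back into the target
```

Note that this treats the target as a constant, so no parameters upstream of the target get updated through it.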

(Federico Pala) #3

You are missing nonlinearities between layers! (Edit: I misunderstood what you were doing.)

(Xu Shen) #4

I see, it works. Thanks for your kind reply.

(Xu Shen) #5

But there is another problem: in loss1, the gradients with respect to both l1_r and l1 are needed to update the parameters. How can I handle this case? Maybe the following way (not very elegant):
loss1 = mse(l1_r, Variable(l1.data, requires_grad=False)) + mse(l1, Variable(l1_r.data, requires_grad=False))

(Xu Shen) #6

Still, thanks for your attention~


In that case I’d recommend that you write your own loss function. Something like:
loss = ((l1_r - l1) * (l1_r - l1)).mean()
so that you can backprop through both l1_r and l1.
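To check that this really sends gradients through both l1_r and l1, here is a rough sketch that compares the hand-written loss with the two-term workaround from post #5. It assumes a newer PyTorch release (plain tensors, detach() instead of Variable(..., requires_grad=False)), and the helper intermediate_loss is a made-up name for illustration. Both versions should give the same gradients, since each detached term contributes exactly one of the two partial derivatives; only the loss value differs by a factor of two.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def intermediate_loss(detach_trick):
    # Hypothetical helper: rebuild the same small network with the same
    # seed each time, so the two variants are directly comparable.
    torch.manual_seed(0)
    fc1 = nn.Linear(50, 20)
    fc2 = nn.Linear(20, 10)
    fc2_r = nn.Linear(10, 20)
    x = torch.randn(100, 50)

    l1 = fc1(x)
    l1_r = fc2_r(fc2(l1))

    if detach_trick:
        # Workaround from post #5: each term treats the other side as a constant.
        loss = mse(l1_r, l1.detach()) + mse(l1, l1_r.detach())
    else:
        # Hand-written squared error: differentiable w.r.t. both l1_r and l1.
        loss = ((l1_r - l1) * (l1_r - l1)).mean()

    loss.backward()
    return loss.item(), fc1.weight.grad.clone(), fc2_r.weight.grad.clone()

v_direct, g_fc1_direct, g_fc2r_direct = intermediate_loss(detach_trick=False)
v_trick, g_fc1_trick, g_fc2r_trick = intermediate_loss(detach_trick=True)

# Same gradients on both the encoder and decoder side.
print(torch.allclose(g_fc1_direct, g_fc1_trick, atol=1e-5))
print(torch.allclose(g_fc2r_direct, g_fc2r_trick, atol=1e-5))
# The workaround counts the squared error twice.
print(v_trick / v_direct)
```

So the hand-written version is mostly a cleaner way to write the same thing, with the loss value on the usual scale.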

(Xu Shen) #8

Wow, it works! Thanks very much.