I am trying to build an auto-encoder model, but when I add losses on intermediate layers, PyTorch raises the following AssertionError:
AssertionError: nn criterions don’t compute the gradient w.r.t. targets - please mark these variables as volatile or not requiring gradients.
My code is listed as follows:
import torch
import torch.nn as nn
from torch.autograd import Variable

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
fc1_r = nn.Linear(20, 50)
mse = nn.MSELoss()
x = Variable(torch.randn(100, 50))
l1 = fc1(x)
l2 = fc2(l1)
l1_r = fc2_r(l2)
x_r = fc1_r(l1_r)
loss0 = mse(x_r, x)   # fine: x does not require grad
loss1 = mse(l1_r, l1) # raises the AssertionError: l1 requires grad
The error means that the second argument to mse (the target in mse(output, target)) must be a Variable with requires_grad=False. The failing call is mse(l1_r, l1), because l1 is the output of fc1 and therefore requires grad. I'm not sure why you're computing that loss when l1_r and l1 are related to each other, but to get rid of the error you can try something like:
loss1 = mse(l1_r, Variable(l1.data, requires_grad=False))
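For reference, a runnable sketch of the same fix in current PyTorch, where Variable has been merged into Tensor and .detach() plays the role of Variable(l1.data, requires_grad=False) (this is an updated sketch, not the code from the original post):

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
fc1_r = nn.Linear(20, 50)
mse = nn.MSELoss()

x = torch.randn(100, 50)        # plain tensor: requires_grad is False
l1 = fc1(x)
l2 = fc2(l1)
l1_r = fc2_r(l2)
x_r = fc1_r(l1_r)

loss0 = mse(x_r, x)             # x is a constant target, so this is fine
loss1 = mse(l1_r, l1.detach())  # detach() marks the target as a constant
(loss0 + loss1).backward()
```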
You are missing non-linearities between layers! (Edit: I misunderstood what you were doing.)
I see, it works, thanks for your kind reply.
But there is another problem: in loss1, gradients with respect to both l1_r and l1 are needed to update the parameters. How can I handle that case? Maybe the following way (not very elegant):
loss1 = mse(l1_r, Variable(l1.data, requires_grad=False)) + mse(l1, Variable(l1_r.data, requires_grad=False))
Still, thanks for your attention!
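This symmetric workaround does send gradients into both branches, since each term treats one side as a constant target. A minimal sketch with present-day tensors (.detach() in place of Variable(..., requires_grad=False)):

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)
mse = nn.MSELoss()

x = torch.randn(100, 50)
l1 = fc1(x)
l1_r = fc2_r(fc2(l1))

# each term holds one side fixed, so both branches receive gradients
loss1 = mse(l1_r, l1.detach()) + mse(l1, l1_r.detach())
loss1.backward()
```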
In that case I'd recommend that you write your own loss function. Something like
loss1 = ((l1_r - l1) * (l1_r - l1)).mean()
so that you can backprop through both l1_r and l1.
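A runnable sketch of this hand-written loss (again using present-day tensors rather than Variable): because no nn criterion is involved, there is no assertion on the target, and autograd propagates through both l1_r and l1 in a single backward pass.

```python
import torch
import torch.nn as nn

fc1 = nn.Linear(50, 20)
fc2 = nn.Linear(20, 10)
fc2_r = nn.Linear(10, 20)

x = torch.randn(100, 50)
l1 = fc1(x)
l1_r = fc2_r(fc2(l1))

# hand-written MSE: gradients flow into both l1_r and l1
loss1 = ((l1_r - l1) * (l1_r - l1)).mean()
loss1.backward()
```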
Wow, it works! Thanks very much.