RuntimeError: One of the differentiated Tensors appears to not have been used in the graph

Mohamed_Magdy · March 15, 2021, 11:26pm

Hello,
I’m currently implementing a project for my machine learning course, and I have run into an issue computing gradients with respect to additional trainable parameters. I’m trying to implement the algorithm with the help of a GitHub project that I’m using as a reference.
I’m receiving “RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior” during step 10, “Computation of gradients w.r.t. epsilon”.
The part of my code that implements steps 1 to 10 is as follows:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=gamma, momentum=rho)
for epoch in range(T):
    running_loss = 0.0
    for idx, (train_x, train_label) in enumerate(trainloader):
      net.train()
      # initialize a dummy network for the meta learning of the weights
      meta_net = LeNet().to(device)
      meta_net_optimizer = torch.optim.SGD(meta_net.parameters(), lr=gamma, momentum=rho)
      # Step 4 - 5 forward pass to compute the initial weighted loss
      # Forward propagation
      outputs = meta_net(train_x)
      # Error evaluation
      train_label = train_label.squeeze()
      loss = criterion(outputs,train_label.long())
      eps = to_var(torch.zeros(loss.size()))
      l_f_meta = torch.sum(loss * eps)
      meta_net.zero_grad()
      # Line 6 perform a parameter update
      grads = torch.autograd.grad(l_f_meta, meta_net.parameters(), create_graph=True)
      meta_net_optimizer.step()
      # Line 8 - 10 2nd forward pass and getting the gradients with respect to epsilon
      y_g_hat = meta_net(test_data_t)
      l_g_meta = criterion(y_g_hat,test_labels_t.long())  
      grad_eps = torch.autograd.grad(l_g_meta, eps, only_inputs=True) # >> Step 10

Could you please advise why I’m receiving this error? Supposedly, epsilon is a tensor. The issue is in
grad_eps = torch.autograd.grad(l_g_meta, eps, only_inputs=True). My target is to

Dwight_Foster · March 16, 2021, 12:39am

Have you tried setting requires_grad=True in torch.zeros, like this:

      eps = to_var(torch.zeros(loss.size(), requires_grad=True))

Mohamed_Magdy · March 16, 2021, 6:26am

Same issue. Is there a way to determine which variables are differentiable? The error tells me that eps cannot be found in the graph, but I have no idea why.

Dwight_Foster · March 16, 2021, 2:20pm

No, because eps is not in the graph, so the derivative will always be None. I do not know a way to get around this. You could try looking at these functions to see if you can use them to calculate the correct gradients, but I cannot find a way to add eps to the graph.
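
To make the point about eps not being in the graph concrete, here is a minimal sketch rather than the original poster's code. It assumes a small `nn.Linear` as a hypothetical stand-in for `LeNet`, random tensors in place of the real data, and `reduction="none"` so that there is one loss value per example. The weights are updated with an ordinary `optimizer.step()`, which modifies them in place without autograd recording the operation, so the second loss has no path back to `eps`. Passing `allow_unused=True` is one way to check this: `torch.autograd.grad` then returns `None` for an unreachable input instead of raising.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(4, 3)  # hypothetical stand-in for LeNet
x, y = torch.randn(8, 4), torch.randint(0, 3, (8,))          # dummy training batch
x_val, y_val = torch.randn(5, 4), torch.randint(0, 3, (5,))  # dummy clean batch

# One loss per example, weighted by eps (one weight per example)
per_example = nn.functional.cross_entropy(net(x), y, reduction="none")
eps = torch.zeros_like(per_example, requires_grad=True)
l_f_meta = torch.sum(per_example * eps)

opt = torch.optim.SGD(net.parameters(), lr=0.1)
opt.zero_grad()
l_f_meta.backward()
opt.step()  # in-place weight update, not recorded by autograd

# The second loss is built from the updated weights only
l_g_meta = nn.functional.cross_entropy(net(x_val), y_val)

# allow_unused=True yields None for inputs that the output does not depend on
(grad_eps,) = torch.autograd.grad(l_g_meta, eps, allow_unused=True)
eps_in_graph = grad_eps is not None
print(eps_in_graph)  # False
```

Setting `requires_grad=True` on `eps` is necessary but, in this sketch, not sufficient: the update step itself has to be expressed as differentiable tensor operations for `eps` to stay connected to the second loss.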

Liangbo_Ning · June 9, 2022, 2:27am

Maybe you can get some inspiration from here: GitHub - TinfoilHat0/Learning-to-Reweight-Examples-for-Robust-Deep-Learning-with-PyTorch-Higher, an implementation of the paper "Learning to Reweight Examples for Robust Deep Learning" (ICML 2018) with PyTorch and Higher.
The “Higher” package can be used to compute the gradient in this case.
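
For readers who want to see the idea without an extra dependency, the following is a rough sketch of a differentiable update written by hand. It is not Higher's API and not the code from the linked repository. It assumes PyTorch 2.0 or newer for `torch.func.functional_call`, a small `nn.Linear` as a hypothetical stand-in for `LeNet`, random data, and a plain SGD step without momentum. The updated weights are computed out of place from gradients taken with `create_graph=True`, so they remain a function of `eps`.

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # assumes PyTorch 2.0 or newer

torch.manual_seed(0)
net = nn.Linear(4, 3)  # hypothetical stand-in for LeNet
x, y = torch.randn(8, 4), torch.randint(0, 3, (8,))          # dummy training batch
x_val, y_val = torch.randn(5, 4), torch.randint(0, 3, (5,))  # dummy clean batch
lr = 0.1  # assumed inner learning rate

params = dict(net.named_parameters())

# First forward pass: per-example losses weighted by eps
logits = functional_call(net, params, (x,))
per_example = nn.functional.cross_entropy(logits, y, reduction="none")
eps = torch.zeros_like(per_example, requires_grad=True)
l_f_meta = torch.sum(per_example * eps)

# create_graph=True keeps these gradients differentiable with respect to eps
grads = torch.autograd.grad(l_f_meta, list(params.values()), create_graph=True)

# Out-of-place SGD step: the new weights are ordinary graph nodes
updated = {name: p - lr * g for (name, p), g in zip(params.items(), grads)}

# Second forward pass with the updated weights
val_logits = functional_call(net, updated, (x_val,))
l_g_meta = nn.functional.cross_entropy(val_logits, y_val)

# eps is now reachable from l_g_meta, so no allow_unused is needed
(grad_eps,) = torch.autograd.grad(l_g_meta, eps)
print(grad_eps.shape)  # torch.Size([8])
```

The original `net` is left unchanged here; `updated` only exists to compute `grad_eps`, after which the real weights could be trained with whatever optimizer the project already uses.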