def calculate_gradient_penalty(self, real_sens, fake_sens):
    eta = torch.FloatTensor(opt.BATCH_SIZE, opt.MAX_SEQ_LEN).uniform_(0, 1)
    if self.cuda:
        eta = eta.cuda()
    # CUDALongTensor ---> LongTensor ---> FloatTensor ---> CUDAFloatTensor
    new_real_sens = real_sens.cpu().type(torch.FloatTensor).cuda()
    new_fake_sens = fake_sens.cpu().type(torch.FloatTensor).cuda()
    interpolated = eta * new_real_sens + ((1 - eta) * new_fake_sens)  # CUDAFloatTensor
    # calculate probability of interpolated examples
    prob_interpolated = self.D(interpolated.cpu().type(torch.LongTensor).cuda())  # batch_size, 1 (LongTensor)
    # define it to calculate gradient
    interpolated = Variable(interpolated, requires_grad=True)
    # calculate gradients of probabilities with respect to examples
    gradients = torch.autograd.grad(outputs=prob_interpolated, inputs=interpolated,
                                    grad_outputs=torch.ones(prob_interpolated.size()).cuda(),
                                    create_graph=True, retain_graph=True)
    grad_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean() * self.lambda_term
    return grad_penalty
I wrote the calculate_gradient_penalty function of WGAN-GP above, but I get the following error:
“RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.”
I searched for similar problems in other people's code; there the cause was usually a ‘.detach()’, ‘.data’, or ‘.numpy()’ somewhere in the chain. After investigating, I don't think that applies here. So I would like to know what is causing the error in my case.
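For reference, a minimal illustration (not my actual code) of how a ‘.detach()’ somewhere in the chain produces exactly this error:

import torch

x = torch.randn(4, requires_grad=True)
z = torch.randn(4, requires_grad=True)
y = x.detach()       # y is cut off from x's part of the graph
out = (y * z).sum()  # out requires grad through z, but not through x

# x is a "differentiated Tensor" that was never used to compute out:
torch.autograd.grad(outputs=out, inputs=x)
# RuntimeError: One of the differentiated Tensors appears to not have been used in the graph.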
From the documentation: allow_unused (bool, optional) – If False, specifying inputs that were not used when computing outputs (and therefore their grad is always zero) is an error. Defaults to False.
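As a small sketch of what that flag does in practice (with throwaway tensors, not your model):

import torch

x = torch.randn(3, requires_grad=True)
y = torch.randn(3, requires_grad=True)
out = (x * 2).sum()  # y is never used to compute out

# With allow_unused=True the call no longer raises; the gradient for the
# unused input is simply returned as None.
gx, gy = torch.autograd.grad(out, (x, y), allow_unused=True)
print(gx)  # tensor([2., 2., 2.])
print(gy)  # None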
Thanks for your answer. I had already tried that before opening this topic, without success. When I set allow_unused=True as the error suggests, the gradient I get is None, which also confuses me.
How did you check this? Are there any tools or methods to verify the computational graph?
I have the same problem, but it's not easy for me to check since my model is a little complicated.
@ptrblck @albanD apologies for the direct ping. Unsure who to ping.
I am facing this issue too, but I know it should not be happening: all the weights should be in use.
Is it possible to improve the error message and display the names of the weights causing this issue? That would at least give me a place to start debugging.
My model is a simple 5CNN and I pass mini-ImageNet data through it. There shouldn't be this issue, but I would love to get better error messages.
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
You can't directly print which Tensor is problematic, no.
But you can pass the allow_unused=True flag to suppress the error and then check which of the returned gradients are None. Those correspond to the inputs that were not connected to the graph.
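A rough sketch of that check (Toy here is just a stand-in model with a layer that forward() never uses):

import torch
import torch.nn as nn

class Toy(nn.Module):
    # Stand-in model: `skipped` is deliberately never used in forward(),
    # so its parameters are not connected to the graph.
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 1)
        self.skipped = nn.Linear(4, 1)

    def forward(self, x):
        return self.used(x)

model = Toy()
loss = model(torch.randn(2, 4)).sum()

names, params = zip(*model.named_parameters())
grads = torch.autograd.grad(loss, params, allow_unused=True)
print([n for n, g in zip(names, grads) if g is None])
# ['skipped.weight', 'skipped.bias']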
It should be trivial from the PyTorch dev side: if PyTorch already knows some params are NOT in the forward pass (since it failed to do the backprop), then instead of giving me an existential quantifier, tell me which one concretely is not being used. Then I can debug it. Otherwise, where do I even start in a large model?
Hi, I encountered a similar problem when I reproduced the gradient penalty of WGAN-GP. It appears that changing the view/shape of the tensor, say X, that is passed to the inputs argument causes this problem. I don't know whether this is true for the original post or whether this is the desired behavior. It can be reproduced with the following code:
import torch
from torch.autograd import grad

a = torch.randn((3, 4), requires_grad=True)
b = a @ (torch.arange(4).float() + 1).reshape(4, 1)
c = b.sum()

# Passing a view of `a` as the input fails with:
# RuntimeError: ...Tensors appears to not have been used in the graph. Set allow_unused=True...
gs = grad(c, a.view(-1), torch.ones_like(c), retain_graph=True, create_graph=True)[0]

# Passing the leaf tensor `a` itself succeeds
gs = grad(c, a, torch.ones_like(c), retain_graph=True, create_graph=True)[0]
I think the failure in your code is expected, as you are passing a view of a (which has a valid .grad_fn) as the input to torch.autograd.grad; a.view(-1) is not a leaf tensor anymore, and the freshly created view was never itself used to compute c.
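To make that concrete with the same toy tensors (and a possible workaround of reshaping the gradient rather than the input):

import torch
from torch.autograd import grad

a = torch.randn((3, 4), requires_grad=True)
c = (a @ (torch.arange(4).float() + 1).reshape(4, 1)).sum()

print(a.is_leaf)           # True: this is the tensor c was computed from
print(a.view(-1).is_leaf)  # False: a brand-new view node, never used to compute c

# If a flattened gradient is what you want, differentiate w.r.t. the leaf
# and reshape the result instead:
gs = grad(c, a, retain_graph=True)[0].view(-1)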
@ptrblck @albanD how do I set allow_unused=True globally? I just call a normal .backward(), so I never use torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False, is_grads_batched=False) directly myself, as the docs suggest.
How do I set this to True? (I'm hoping this will get around the fact that the params of my ViT/transformer are likely not all being used in the forward pass, and that's fine.)
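A plain .backward() does not actually need allow_unused: parameters that never took part in the forward pass simply keep .grad == None, and no error is raised. A rough sketch with stand-in layers (`skipped` plays the role of the unused part of the model):

import torch
import torch.nn as nn

used, skipped = nn.Linear(4, 1), nn.Linear(4, 1)  # stand-ins for real sub-modules
loss = used(torch.randn(2, 4)).sum()              # `skipped` is never called
loss.backward()

# After backward(), only the unused parameters still have .grad == None:
print([p.grad is None for p in used.parameters()])     # [False, False]
print([p.grad is None for p in skipped.parameters()])  # [True, True]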