I am trying to take derivatives of a network with respect to its inputs, and I am confused by an error I have been getting. For this example my network, `model`, takes 3 inputs and has two outputs.
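For concreteness, a hypothetical stand-in with the same shape (the exact architecture shouldn't matter here) would be something like:

```python
import torch
from torch import nn

# stand-in for my network: 3 inputs -> 2 outputs
model = nn.Sequential(
    nn.Linear(3, 64),
    nn.Tanh(),
    nn.Linear(64, 2),
).cuda()
```

Here is the code I am running: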
```python
from torch.autograd import Variable, grad

ipt = Variable(batch[0], requires_grad=True).view(batch_size, batch[0].shape[1])
x = ipt[:, 0].unsqueeze(1)  # first input column
y = ipt[:, 1].unsqueeze(1)  # second input column
z = ipt[:, 2].unsqueeze(1)  # third input column
out = model(ipt).cuda()
A = out[:, 0].unsqueeze(1)  # first network output
B = out[:, 1].unsqueeze(1)  # second network output
# first derivatives of A with respect to each input column
A_z = grad(A, z, torch.ones((batch_size, 1), requires_grad=True).cuda(), create_graph=True)[0]
A_y = grad(A, y, torch.ones((batch_size, 1), requires_grad=True).cuda(), create_graph=True)[0]
A_x = grad(A, x, torch.ones((batch_size, 1), requires_grad=True).cuda(), create_graph=True)[0]
```
I get the following error on the `A_z` line:
```
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
```
This confuses me, since this is the only derivative computed so far, so it seems like `z` isn't being recognized as part of the graph?
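In case it helps, here is a minimal self-contained sketch that should reproduce the error, using the hypothetical stand-in model from above (the error also appears on CPU, so `.cuda()` is dropped):

```python
import torch
from torch import nn
from torch.autograd import grad

# hypothetical stand-in: 3 inputs -> 2 outputs
model = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 2))

batch_size = 5
ipt = torch.randn(batch_size, 3, requires_grad=True)
x = ipt[:, 0].unsqueeze(1)
y = ipt[:, 1].unsqueeze(1)
z = ipt[:, 2].unsqueeze(1)

out = model(ipt)
A = out[:, 0].unsqueeze(1)

# raises: RuntimeError: One of the differentiated Tensors appears
# to not have been used in the graph ...
A_z = grad(A, z, torch.ones(batch_size, 1), create_graph=True)[0]
```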