How to create output equivalent to tf.gradients()?

I am trying to implement a loss function in PyTorch which requires me to get the gradient of the logits of a GAN's discriminator with respect to its input.

There is already a TensorFlow implementation of this, which uses the tf.gradients function:

    # -----------------------------------------------------------------------------------
    #     JS-Regularizer
    # -----------------------------------------------------------------------------------
    def Discriminator_Regularizer(D1_logits, D1_arg, D2_logits, D2_arg):
        with tf.name_scope('disc_reg'):
            D1 = tf.nn.sigmoid(D1_logits)
            D2 = tf.nn.sigmoid(D2_logits)
            grad_D1_logits = tf.gradients(D1_logits, D1_arg)[0]
            grad_D2_logits = tf.gradients(D2_logits, D2_arg)[0]
            grad_D1_logits_norm = tf.norm(tf.reshape(grad_D1_logits, [BATCH_SIZE//len(DEVICES), -1]), axis=1)
            grad_D2_logits_norm = tf.norm(tf.reshape(grad_D2_logits, [BATCH_SIZE//len(DEVICES), -1]), axis=1)
            # set keep_dims=True/False such that grad_D_logits_norm.shape == D.shape
            print('grad_D1_logits_norm.shape {} != D1.shape {}'.format(grad_D1_logits_norm.shape, D1.shape))
            print('grad_D2_logits_norm.shape {} != D2.shape {}'.format(grad_D2_logits_norm.shape, D2.shape))
            assert grad_D1_logits_norm.shape == D1.shape
            assert grad_D2_logits_norm.shape == D2.shape
            reg_D1 = tf.multiply(tf.square(1.0 - D1), tf.square(grad_D1_logits_norm))
            reg_D2 = tf.multiply(tf.square(D2), tf.square(grad_D2_logits_norm))
            disc_regularizer = tf.reduce_mean(reg_D1 + reg_D2)
        return disc_regularizer

Does anyone know how to do the equivalent using autograd? Is this functionality possible in PyTorch? It would be greatly appreciated.

My understanding is that tf.gradients(ys, xs) constructs the symbolic derivatives of the sum of ys with respect to each x in xs. How is this different from autograd.grad()?
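For context, here is a rough sketch of how the same regularizer might look in PyTorch. This is my own translation, not tested against the TF version: it assumes the discriminator inputs were created with requires_grad=True, and it passes grad_outputs=torch.ones_like(...) to torch.autograd.grad to differentiate the sum of the non-scalar logits, which is what tf.gradients does implicitly.

```python
import torch

def discriminator_regularizer(D1_logits, D1_arg, D2_logits, D2_arg):
    # D1_arg and D2_arg must have requires_grad=True.
    D1 = torch.sigmoid(D1_logits)
    D2 = torch.sigmoid(D2_logits)
    # grad_outputs=torch.ones_like(...) mimics tf.gradients, which
    # differentiates the *sum* of the (non-scalar) logits.
    # create_graph=True keeps the graph so the regularizer itself
    # can be backpropagated through during training.
    grad_D1 = torch.autograd.grad(D1_logits, D1_arg,
                                  grad_outputs=torch.ones_like(D1_logits),
                                  create_graph=True)[0]
    grad_D2 = torch.autograd.grad(D2_logits, D2_arg,
                                  grad_outputs=torch.ones_like(D2_logits),
                                  create_graph=True)[0]
    # Per-sample gradient norms, shaped [batch, 1] to match D1/D2.
    grad_D1_norm = grad_D1.view(grad_D1.size(0), -1).norm(dim=1, keepdim=True)
    grad_D2_norm = grad_D2.view(grad_D2.size(0), -1).norm(dim=1, keepdim=True)
    reg_D1 = (1.0 - D1) ** 2 * grad_D1_norm ** 2
    reg_D2 = D2 ** 2 * grad_D2_norm ** 2
    return (reg_D1 + reg_D2).mean()
```

The matrix multiply producing the logits is whatever your discriminator does; the key part is the grad_outputs argument.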


I think you’re looking for autograd.grad. It computes the gradients of outputs with respect to some inputs.

Hi Richard,

Thanks for your response. Yes I figured this would be the implementation choice, but when I do try to use it, I am told:

RuntimeError: grad can be implicitly created only for scalar outputs

Whereas the TF version works fine. Do you know what this error might be suggesting?

The shapes of the (outputs, inputs) are ([512, 1], [512, 3]). They are both Variables. Is this the issue?

autograd.grad can only create the gradient implicitly when the output of the function is a scalar.

The outputs you pass to autograd.grad should be one of the following:

  • A Variable wrapping a Tensor of size (1,) (i.e., effectively a scalar)
  • An arbitrary Variable (or tuple of Variables). In this case, you must specify grad_outputs= for autograd.grad, with the same shape(s) as the outputs.
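To illustrate with the shapes from the question above (a [512, 1] output of a [512, 3] input; the random matrix multiply is just a stand-in for the discriminator):

```python
import torch

x = torch.rand(512, 3, requires_grad=True)
logits = x @ torch.rand(3, 1)  # shape [512, 1], non-scalar

# torch.autograd.grad(logits, x) would raise:
#   RuntimeError: grad can be implicitly created only for scalar outputs

# Passing grad_outputs with the same shape as `logits` differentiates
# the sum of its elements, like tf.gradients does:
grad_x, = torch.autograd.grad(logits, x,
                              grad_outputs=torch.ones_like(logits))
print(grad_x.shape)  # torch.Size([512, 3])
```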

If you have found the answer to your problem, can you share the code, please?

Thank you!

Example 1:

import torch
from torch.autograd import Variable

xx = Variable(torch.rand(1), requires_grad=True)
cc = 2 * xx
torch.autograd.grad(cc, xx)  # d(cc)/d(xx) = 2

Example 2:

xx = Variable(torch.rand(1), requires_grad=True)
cc = 2 * xx
gg = 3 * cc
torch.autograd.grad(gg, xx)  # chain rule: d(gg)/d(xx) = 3 * 2 = 6

Would you please explain the second case? For example here.