# Dimension of a scalar-by-vector derivative?

I have a question about the dimension of the gradient produced by backward() in a very simple case. Suppose we have a 2-dimensional vector z whose inner product with an all-ones vector forms the loss l, so l = z^T 1 = 1^T z. The usual convention is for dl/dz to be a row vector (cf. https://mathinsight.org/derivative_matrix). However, the dimension of the gradient produced by backward() depends on whether z is a column vector or a row vector. Maybe there is some deeper reason behind this design?

```python
import torch
from torch.autograd import Variable

dtype = torch.FloatTensor

# z is a column vector (2 x 1); values are arbitrary
z = Variable(torch.randn(2, 1).type(dtype), requires_grad=True)
l = Variable(torch.ones(2, 1).t().type(dtype)).mm(z)   # l = 1^T z
l.backward()
print(z.grad)   # shape (2, 1): a column vector, like z

# z is a row vector (1 x 2)
z = Variable(torch.randn(1, 2).type(dtype), requires_grad=True)
l = z.mm(Variable(torch.ones(2, 1).type(dtype)))        # l = z 1
l.backward()
print(z.grad)   # shape (1, 2): a row vector, like z
```

The reason I ask is that this has implications for the dimension of the grad_variables argument to supply to backward() when there is a chain of operations. For example, suppose we have the following equations:

x = [[5, 1], [1, 5]] z
l = 1^T x

Then dl/dz should be [6 6]. To compute this gradient with the chain rule, we would use dl/dz = dl/dx dx/dz, where dx/dz = [[5, 1], [1, 5]]. It then seems natural for dl/dx to be the row vector [1 1] so that the matrix product is well defined.
However, to compute this gradient using x.backward(grad_variables = vec), we need vec to be the column vector [1; 1] (the same shape as x), instead of [1 1]. A minimal sketch is below.
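Here is a small sketch of that chain-rule case, assuming z is a random 2 x 1 column vector (the matrix A and the random initialization are just for illustration):

```python
import torch
from torch.autograd import Variable

dtype = torch.FloatTensor

A = Variable(torch.Tensor([[5, 1], [1, 5]]).type(dtype))
z = Variable(torch.randn(2, 1).type(dtype), requires_grad=True)

x = A.mm(z)                                 # x = A z, shape (2, 1)
x.backward(torch.ones(2, 1).type(dtype))    # vec must match x's shape: [1; 1]
print(z.grad)                               # [[6], [6]], same shape as z
```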

Thanks!

`z.grad` always matches the shape of `z`, so that each element of `z.grad` is the derivative of the loss with respect to the corresponding element of `z`.

This makes it easy to apply updates to `z`. Optimizers all do something like this:

```python
p.data += -learning_rate * p.grad.data
```
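For example, here is a minimal sketch of why the shape-matching convention makes the update convenient (the parameter `p`, its shape, and the learning rate are just illustrative):

```python
import torch
from torch.autograd import Variable

p = Variable(torch.randn(3, 1), requires_grad=True)
loss = (p * p).sum()        # any scalar loss
loss.backward()             # p.grad now has the same shape as p: (3, 1)

learning_rate = 0.1
p.data += -learning_rate * p.grad.data   # elementwise update, shapes line up
```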

I haven't thought through the implications of the chain rule, but if you want to do this

```python
myvar.backward(grads)
```

then `grads` should have the same shape as `myvar`.

When `myvar` has only one element and you omit `grads`, PyTorch assumes that you meant to pass in `Variable(torch.ones(myvar.size()))`.
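A small sketch of both cases (variable names and values are arbitrary):

```python
import torch
from torch.autograd import Variable

z = Variable(torch.randn(2, 1), requires_grad=True)

# myvar has one element: grads can be omitted
l = z.sum()
l.backward()                       # same as l.backward(torch.ones(l.size()))

# myvar has more than one element: grads must match its shape
z.grad.data.zero_()
x = 2 * z                          # shape (2, 1)
x.backward(torch.ones(x.size()))
print(z.grad)                      # [[2], [2]]
```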

Thank you for your explanation! That seems like a very reasonable rationale!