Clarification - Using backward() on non-scalars

Hey,

Sorry if this is obvious, but I find the description of the torch.autograd.backward(variables, grad_variables, retain_variables=False) function quite confusing.

I’m working on a project where I have a vector of variables that I would like to differentiate to find the Jacobian. When it comes to implementing this, I’m not sure what form grad_variables should be or what a ‘sequence of Tensor’ is. I’ve tried many things, but all throw an error.

Would anyone be able to point me in the direction of an example if one exists? If not, say I had the following super simple example:

x = Variable(torch.FloatTensor([[2,1]]), requires_grad=True)
M = Variable(torch.FloatTensor([[1,2],[3,4]]))
y = torch.mm(x, M)

What should the arguments for y.backward() be so that I can find [[dy1/dx1, dy1/dx2], [dy2/dx1, dy2/dx2]] (i.e. recover M)?


The naming of grad_variables might be a little bit confusing. In the context of neural networks, it’s the “loss”.

To recover M requires two calls to backward(). Here's how with Variable.backward():

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([[2,1]]), requires_grad=True)
M = Variable(torch.FloatTensor([[1,2],[3,4]]))
y = torch.mm(x, M)
jacobian = torch.FloatTensor(2, 2).zero_()

# Backprop 1*y1 + 0*y2 to fill the first column with dy1/dx ...
y.backward(torch.FloatTensor([[1, 0]]), retain_variables=True)
jacobian[:,0] = x.grad.data
x.grad.data.zero_()

# ... then 0*y1 + 1*y2 to fill the second column with dy2/dx.
y.backward(torch.FloatTensor([[0, 1]]), retain_variables=True)
jacobian[:,1] = x.grad.data

You can also replace the y.backward() calls with the equivalent:

torch.autograd.backward([y], [torch.FloatTensor([[1, 0]])], retain_variables=True)
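
For reference, here is that equivalent form as a small self-contained sketch (same toy example, same Variable / retain_variables API as above; in later PyTorch releases retain_variables was renamed retain_graph):

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([[2, 1]]), requires_grad=True)
M = Variable(torch.FloatTensor([[1, 2], [3, 4]]))
y = torch.mm(x, M)

# Same effect as y.backward(torch.FloatTensor([[1, 0]]), retain_variables=True)
torch.autograd.backward([y], [torch.FloatTensor([[1, 0]])], retain_variables=True)
print(x.grad.data)  # [[1, 3]], i.e. dy1/dx1 and dy1/dx2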


Fab, that’s a great help. Thanks for your time.


The naming of grad_variables might be a little bit confusing. In the context of neural networks, it’s the “loss”.

I am a bit confused by this. In this example, isn’t the “loss” y itself?

torch.FloatTensor([[1, 0]]) is passed to grad_variables and specifies that we’ll use the first column of y’s gradients, and torch.FloatTensor([[0, 1]]) for the second column.

By default, grad_variables is torch.Tensor([1]), which means the calculated gradients are simply kept as they are.
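
For what it’s worth, here is a small sketch that illustrates this reading, reusing the same toy x, M and y from above (the tensor passed to backward() just weights y’s components):

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([[2, 1]]), requires_grad=True)
M = Variable(torch.FloatTensor([[1, 2], [3, 4]]))
y = torch.mm(x, M)  # y = [[5, 8]], dy1/dx = [1, 3], dy2/dx = [2, 4]

# Backprop the weighted "loss" 2*y1 + 3*y2
y.backward(torch.FloatTensor([[2, 3]]))
print(x.grad.data)  # [[8, 18]] = 2*[1, 3] + 3*[2, 4]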

Is that correct?

Thanks a lot!

Thanks for your intuitive example.
I’m getting a bit lost on the usage of the following line:
y.backward(torch.FloatTensor([[1, 0]]), retain_variables=True)
especially when it comes to the gradient argument.

As @linlin already mentioned, is the argument gradient=torch.FloatTensor([[1,0]]) in the provided example used to pick out one column (e.g. the first one in the above scenario), or does it define the variable that we compute the gradient with respect to (as suggested by http://pytorch.org/docs/master/autograd.html#torch.autograd.Variable.backward)?

If it defines the variable we compute the gradient with respect to, why
y.backward(torch.FloatTensor([[1, 0]]), retain_variables=True)
jacobian[:,0] = x.grad.data
x.grad.data.zero_()
y.backward(torch.FloatTensor([[0, 1]]), retain_variables=True)
jacobian[:,1] = x.grad.data

is used instead of

y.backward(x * torch.FloatTensor([[1, 0]]), retain_variables=True)
jacobian[:,0] = x.grad.data
x.grad.data.zero_()
y.backward(x * torch.FloatTensor([[0, 1]]), retain_variables=True)
jacobian[:,1] = x.grad.data

for computing [[dy1/dx1, dy1/dx2], [dy2/dx1, dy2/dx2]] ?

Thank you in advance!

I think the derivative of y w.r.t. x is not M but the transpose of M:
dy1/dx = [M11, M21], dy2/dx = [M12, M22].

Also, see here for the meaning of the gradient argument in the backward() method.
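
A quick numeric sketch of this, reusing the toy example from above (the rows of the assembled Jacobian are dy1/dx and dy2/dx, and the result matches M.t(), not M):

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([[2, 1]]), requires_grad=True)
M = Variable(torch.FloatTensor([[1, 2], [3, 4]]))
y = torch.mm(x, M)
jacobian = torch.FloatTensor(2, 2).zero_()

y.backward(torch.FloatTensor([[1, 0]]), retain_variables=True)
jacobian[0] = x.grad.data.view(-1)   # row 0: dy1/dx = [1, 3]
x.grad.data.zero_()
y.backward(torch.FloatTensor([[0, 1]]))
jacobian[1] = x.grad.data.view(-1)   # row 1: dy2/dx = [2, 4]

print(jacobian)    # [[1, 3], [2, 4]]
print(M.t().data)  # identical: dy/dx is the transpose of M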


Great answer @jdhao! This clears up my confusion. Thanks a lot!!!
I guess this also answers @linlin’s question.

Cheers! Stack Overflow is your friend :slight_smile:

Thanks @jsm and @jdhao, that cleared things up.

But I still think the naming of grad_variables is a bit misleading. Something like grad_weighting would be more intuitive.

Stackoverflow’s question and answers were perfect. Thanks for the link to SO thread @jdhao.

You can upvote that answer to support it if you have a Stack Overflow account :slight_smile:


Thanks for the explanation @colesbury.
I just want to know: what if my y is the output of the inceptionv3 model? In that case I have to run a loop (1000 iterations, one backward pass per output). Is there a better way to do this?
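
Here is roughly what I mean by that loop, as a sketch only (a tiny made-up linear model stands in for inceptionv3, and the 4/3 sizes are placeholders; for the real model n_out would be 1000):

import torch
import torch.nn as nn
from torch.autograd import Variable

# Hypothetical stand-in for inceptionv3: 4 input features -> 3 outputs
toy_model = nn.Linear(4, 3)

x = Variable(torch.randn(1, 4), requires_grad=True)
y = toy_model(x)          # shape (1, 3); think (1, 1000) for inceptionv3
n_out = y.size(1)

jacobian = torch.zeros(n_out, x.data.numel())
for i in range(n_out):
    grad_output = torch.zeros(1, n_out)
    grad_output[0][i] = 1                     # select output i
    y.backward(grad_output, retain_variables=True)
    jacobian[i] = x.grad.data.view(-1)        # row i: d y_i / d x
    x.grad.data.zero_()

print(jacobian.size())    # (3, 4) -- one backward pass per output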

OMG, thank u so much! I wasted tons of time on understanding so-called gradient on the non-scalar output and you make it clear using one word: “loss” :laughing:

@saan77 Hi! Did you solve it?