import torch

b25 = torch.tensor([[1., 2., 3., 4., 5.], [6., 7., 8., 9., 10.]], requires_grad=True)

def func2(x):
    y = x**2
    return y

y = func2(b25)
vector = torch.tensor([[100., 1., 1., 1., 1.], [1., 1., 1., 1., 1.]])
# Full Jacobian of func2 at b25, shape (2, 5, 2, 5)
Jacobian = torch.autograd.functional.jacobian(func2, b25, create_graph=False, strict=False)
# backward on a non-scalar output needs an explicit gradient tensor
y.backward(vector)
The Jacobian is a tensor of shape (2, 5, 2, 5), because y has shape (2, 5) and b25 also has shape (2, 5).
The Jacobian holds the derivatives dy/db25.
According to the vector-Jacobian product, we should be able to multiply the Jacobian with vector and get the same result as y.backward(vector). But how can a tensor of shape (2, 5, 2, 5) and a vector of shape (2, 5) be combined to produce a (2, 5) final gradient?
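For reference, this is how I currently read the dimensions (as far as I understand, torch.autograd.functional.jacobian puts the output dimensions first; the print below is just to illustrate):

# Jacobian[i, j, k, l] = d y[i, j] / d b25[k, l]
# y:      shape (2, 5)  -> indices i, j (output)
# b25:    shape (2, 5)  -> indices k, l (input)
# vector: shape (2, 5)  -> same shape as y, passed to y.backward(vector)
print(Jacobian.shape)  # torch.Size([2, 5, 2, 5])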
Jacobian = tensor([[
[[ 2., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 4., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 6., 0., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 8., 0.],
[ 0., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 10.],
[ 0., 0., 0., 0., 0.]]],
[[[ 0., 0., 0., 0., 0.],
[12., 0., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 14., 0., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 16., 0., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 18., 0.]],
[[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 20.]]]])
These are the final gradients computed by PyTorch. How can I get the same result using the Jacobian and vector above?
b25.grad
tensor([[200., 4., 6., 8., 10.],
[ 12., 14., 16., 18., 20.]])
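For completeness, this is what I would try, though I am not sure it is the intended way to express the product (the einsum pattern, the reshape trick, and the torch.autograd.functional.vjp call are my own assumptions):

# Contract vector against the output dimensions (i, j), keeping the input dimensions (k, l).
attempt = torch.einsum('ij,ijkl->kl', vector, Jacobian)
# The same contraction written as an ordinary matrix product on flattened tensors:
attempt_flat = (vector.reshape(1, 10) @ Jacobian.reshape(10, 10)).reshape(2, 5)
# torch.autograd.functional.vjp should compute the vector-Jacobian product directly:
out, vjp_result = torch.autograd.functional.vjp(func2, b25, vector)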