Suppose we have this code:
import torch

x = torch.eye(1, 1, requires_grad=True)  # 1x1 leaf tensor
y = 0.5 * x + 1
z = 3 * y
z.backward(gradient=x)                   # pass x itself as the gradient argument
print("gradients:")
print("x:", x.grad, "\ny:", y.grad, "\nz:", z.grad)
The documentation describes the gradient parameter of the backward() method as:
gradient (Tensor or None): Gradient w.r.t. the tensor. If it is a tensor, it will be automatically converted to a Tensor that does not require grad unless create_graph is True. None values can be specified for scalar Tensors or ones that don't require grad. If a None value would be acceptable then this argument is optional.
What is the intuition behind calling this parameter gradient?
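My current guess (and I may be misreading the docs) is that gradient is the gradient of some downstream scalar with respect to the tensor that backward() is called on, so that for a scalar loss the usual loss.backward() is just shorthand for passing a gradient of 1.0. A minimal sketch of what I mean:

import torch

x = torch.tensor(2.0, requires_grad=True)

# Implicit form: for a scalar output, PyTorch supplies gradient = 1.0 itself.
loss = x ** 2
loss.backward()
print(x.grad)  # tensor(4.)

# Explicit form: passing the "upstream" gradient of 1.0 by hand gives the same result.
x.grad = None  # clear the accumulated gradient
loss = x ** 2
loss.backward(gradient=torch.tensor(1.0))
print(x.grad)  # tensor(4.)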
Usually we compute the gradient of a scalar loss, with loss.backward(). But what happens if we call backward() on a tensor that is not a scalar?
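From a quick experiment (the tensor v below is my own name for the gradient argument), backward() on a non-scalar output fails unless such a tensor is supplied, and with it the result looks like a vector-Jacobian product:

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
z = 3 * (0.5 * x + 1)  # non-scalar output; dz_i/dx_i = 1.5 for every element

# Without a gradient argument this raises
# "RuntimeError: grad can be implicitly created only for scalar outputs".
try:
    z.backward()
except RuntimeError as err:
    print(err)

# Passing a tensor v with the same shape as z works; x.grad then seems to be
# the vector-Jacobian product v^T J, i.e. x.grad[i] = v[i] * 1.5 here.
v = torch.tensor([1.0, 1.0, 1.0])
z = 3 * (0.5 * x + 1)  # rebuild z, to be safe after the failed call above
z.backward(gradient=v)
print(x.grad)  # expect tensor([1.5000, 1.5000, 1.5000])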
Feedback is greatly appreciated.