Suppose I have a scalar tensor t which requires_grad, and I would like to merge this into a larger tensor, for example:
# Tensor created from scalar `t`; pseudo-code.
M = [[ t**2   , sin(t)      ],
     [ cos(t) , sqrt(t + 5) ]]
Then the new tensor M is used to perform some computations, and finally the gradients (w.r.t. t) are computed by calling backward on the result.
Right now I can think of two ways to create the larger tensor M from the scalar t:
- use nested calls to torch.stack,
- or create auxiliary tensors and then multiply and add them together.
Here’s a code example to illustrate what I mean:
from functools import partial
from torch import cos, sin, sqrt, stack, tensor
import torch
# Create all tensors below with dtype float64.
tensor = partial(tensor, dtype=torch.float64)
t = tensor(2, requires_grad=True)
# Option 1.
M = stack([
    stack([ t**2   , sin(t)      ]),
    stack([ cos(t) , sqrt(t + 5) ])
])
# Option 2.
M = (
    t**2 * tensor([[1, 0], [0, 0]])
    + sin(t) * tensor([[0, 1], [0, 0]])
    + cos(t) * tensor([[0, 0], [1, 0]])
    + sqrt(t + 5) * tensor([[0, 0], [0, 1]])
)
# Use M in some downstream computation, then backprop to t.
x = tensor([1, 2])
y = M @ x
result = y @ y
result.backward()
print(t.grad)
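While writing this up, a third variant occurred to me: a single flat stack followed by a reshape (just a sketch, I have not benchmarked it):
# Option 3 (sketch): stack all entries into a flat 1-D tensor, then reshape to 2x2.
M = stack([t**2, sin(t), cos(t), sqrt(t + 5)]).reshape(2, 2)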
Now my question is: what is the preferred way to accomplish this task? Maybe there even exists a more dedicated method? I am interested in all aspects, including the performance of the resulting graph.
Thanks a lot in advance!