I’m running into what looks like a bug in my code, best described by this example…
The variable I want to include in the computation graph is theta['dm']. This variable gets used in m, then included in B. Finally, it gets used in the state transition equation x = A@x + B@u, and a trajectory cost gets calculated.
Here is a small code block:
import torch

dt = 0.1  # time step; defined elsewhere in my code, value shown here for the example
theta = {'dm': torch.ones(1, requires_grad=True)}
m_bar = 1
m = m_bar * (1 + theta['dm'])
# tensor([2.], grad_fn=<MulBackward0>)
B = torch.tensor([[0, 0],
                  [dt/m, 0],
                  [0, 0],
                  [0, dt/m]],
                 requires_grad=True)
# tensor([[0.0000, 0.0000],
#         [0.0500, 0.0000],
#         [0.0000, 0.0000],
#         [0.0000, 0.0500]], requires_grad=True)
# B.grad_fn = None
# B.is_leaf = True
The problem I’m experiencing is that theta['dm'] does not appear to be included in the computational graph. When I calculate the trajectory cost mentioned above and run cost.backward() followed by optimizer_theta.step(), theta['dm'].grad is None, which suggests it is not on the computation graph.
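To make the failure concrete, here is a minimal continuation of the snippet above; A, x, u, and the quadratic cost are stand-ins for my actual setup:

A = torch.eye(4)
x = torch.zeros(4)
u = torch.ones(2)

x = A @ x + B @ u
cost = (x ** 2).sum()    # stand-in for my trajectory cost
cost.backward()

print(theta['dm'].grad)  # None -- theta['dm'] never made it onto the graph
print(B.grad)            # populated -- the backward pass stops at B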
A warning sign I see: when B is created, shouldn’t it have a grad_fn, with B.is_leaf = False? It seems like I’m doing an incorrect assignment of B if I want theta['dm'] to be included in B.
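For contrast, this is what I understand a graph-connected tensor should look like (toy example, the names here are mine):

t = torch.ones(1, requires_grad=True)
y = torch.stack([t * 2.0, t * 3.0])
print(y.is_leaf)  # False
print(y.grad_fn)  # <StackBackward0 object at ...>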
An even simpler example:
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0)
c = a * b                        # works --> tensor(6., grad_fn=<MulBackward0>)
c = torch.tensor([a*b])          # does not work --> tensor([6.]), requires_grad=False, grad_fn=None
c = torch.tensor([[a*b],[a*b]])  # does not work either
What is wrong with the second and third assignments of c, c = torch.tensor([a*b]) and c = torch.tensor([[a*b],[a*b]])?
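For what it’s worth, rebuilding c with torch.stack instead of torch.tensor does seem to keep the graph in my tests:

c = torch.stack([a * b])         # tensor([6.], grad_fn=<StackBackward0>)
c = torch.stack([a * b, a * b])  # tensor([6., 6.], grad_fn=<StackBackward0>)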
Appreciate any help! Thank you.
Update:
Doing some digging, I may have found a hack that works. Does this look correct / elegant?
B1 = torch.tensor([0., 0.])
B2 = torch.cat((dt/m, torch.tensor([0.])))  # dt/m carries the graph into this row
B3 = torch.tensor([0., 0.])
B4 = torch.cat((torch.tensor([0.]), dt/m))
B = torch.stack((B1, B2, B3, B4))
# B.grad_fn = <StackBackward0>, B.is_leaf = False
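Quick sanity check, using the same stand-in A, x, u, and cost as above: B now has a grad_fn, and gradients flow all the way back to theta['dm']:

print(B.is_leaf)  # False
print(B.grad_fn)  # <StackBackward0 object at ...>

x = torch.eye(4) @ torch.zeros(4) + B @ torch.ones(2)
cost = (x ** 2).sum()
cost.backward()
print(theta['dm'].grad)  # tensor([-0.0050]) with the example dt -- no longer None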