I’m running into what looks like a bug in my code, best described by this example…
The variable I want to include in the computation graph is theta['dm']. This variable gets used in m, then included in B. Finally, it gets used in the state transition equation x = A@x + B@u, and a trajectory cost gets calculated.
Here is a small code block:
import torch

dt = 0.1  # time step; defined elsewhere in my code, value shown here for the example
theta = {'dm': torch.ones(1, requires_grad=True)}
m_bar = 1
m = m_bar * (1 + theta['dm'])
# tensor([2.], grad_fn=<MulBackward0>)
B = torch.tensor([[0, 0],
                  [dt/m, 0],
                  [0, 0],
                  [0, dt/m]],
                 requires_grad=True)
# tensor([[0.0000, 0.0000],
#         [0.0500, 0.0000],
#         [0.0000, 0.0000],
#         [0.0000, 0.0500]], requires_grad=True)
# B.grad_fn = None
# B.is_leaf = True
The problem I’m experiencing is that theta['dm'] does not appear to be included in the computational graph. When I calculate the trajectory cost mentioned above and run cost.backward() followed by optimizer_theta.step(), theta['dm'].grad is None, which suggests it is not on the computation graph.
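To make the failure concrete, here is a minimal continuation of the snippet above; A, x, u, and the quadratic cost are stand-ins for my actual setup:

A = torch.eye(4)
x = torch.zeros(4)
u = torch.ones(2)

x = A @ x + B @ u
cost = (x ** 2).sum()    # stand-in for my trajectory cost
cost.backward()

print(theta['dm'].grad)  # None -- theta['dm'] never made it onto the graph
print(B.grad)            # populated -- the backward pass stops at B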
A warning sign I see: when B is created, shouldn’t it have a grad_fn, with B.is_leaf = False? It seems like I’m doing an incorrect assignment of B if I want theta['dm'] to be included in B.
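For contrast, this is what I understand a graph-connected tensor should look like (toy example, the names here are mine):

t = torch.ones(1, requires_grad=True)
y = torch.stack([t * 2.0, t * 3.0])
print(y.is_leaf)  # False
print(y.grad_fn)  # <StackBackward0 object at ...>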
An even simpler example:
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0)
c = a * b                        # works --> tensor(6., grad_fn=<MulBackward0>)
c = torch.tensor([a*b])          # does not work --> tensor([6.]), requires_grad=False, grad_fn=None
c = torch.tensor([[a*b],[a*b]])  # does not work either
What is wrong with the second and third assignments of c, c = torch.tensor([a*b]) and c = torch.tensor([[a*b],[a*b]])?
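For what it’s worth, rebuilding c with torch.stack instead of torch.tensor does seem to keep the graph in my tests:

c = torch.stack([a * b])         # tensor([6.], grad_fn=<StackBackward0>)
c = torch.stack([a * b, a * b])  # tensor([6., 6.], grad_fn=<StackBackward0>)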
Appreciate any help! Thank you.
Update:
Doing some digging, I may have found a hack that works. Does this look correct / elegant?
B1 = torch.tensor([0., 0.])
B2 = torch.cat((dt/m, torch.tensor([0.])))  # dt/m carries the graph into this row
B3 = torch.tensor([0., 0.])
B4 = torch.cat((torch.tensor([0.]), dt/m))
B = torch.stack((B1, B2, B3, B4))
# B.grad_fn = <StackBackward0>, B.is_leaf = False
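Quick sanity check, using the same stand-in A, x, u, and cost as above: B now has a grad_fn, and gradients flow all the way back to theta['dm']:

print(B.is_leaf)  # False
print(B.grad_fn)  # <StackBackward0 object at ...>

x = torch.eye(4) @ torch.zeros(4) + B @ torch.ones(2)
cost = (x ** 2).sum()
cost.backward()
print(theta['dm'].grad)  # tensor([-0.0050]) with the example dt -- no longer None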