Basic "manual two" 2 layers

In the code below, building the a1 tensor seems to "lose" the gradients. What would be the correct way to use l11 and l12 in further computations so that they are included in the backward call?

import torch

x = torch.tensor([3.0, 5.0], requires_grad=True)
w11 = torch.tensor([5.0, 2.0], requires_grad=True)
w12 = torch.tensor([1.5, 3.5], requires_grad=True)

w21 = torch.tensor([-0.5, 2.0], requires_grad=True)

l11 = x @ w11
l12 = x @ w12
a1 = torch.tensor([l11, l12], requires_grad=True)  # this is where the gradients seem to get "lost"
l21 = w21 @ a1

l21.backward()

print("x")
print(x.grad)
print("w11")
print(w11.grad)
print("w12")
print(w12.grad)
print("w21")
print(w21.grad)
print("a1")
print(a1.grad)
print("l21")
print(l21.grad)

Outputs:


x
None
w11
None
w12
None
w21
tensor([25., 22.])
a1
tensor([-0.5000,  2.0000])
l21
None

I think replacing this line

a1 = torch.tensor([l11, l12], requires_grad=True)

with

a1 = torch.cat((l11, l12), 1)

should work.

Ah right, thanks for the pointer. It errors with RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated; a first Google search gives a few hits that I still need to wrap my head around, but I do think this is the right direction.

Hi,

@CedricLy's suggestion is the correct way to do it.

Note that torch.tensor(another_tensor) works like another_tensor.clone().detach(), which is what causes the trouble you mentioned: the new tensor is cut off from the computational graph, so no gradients flow back through l11 and l12.
Also, you are printing a1.grad and getting a tensor instead of None: a1 should not be a leaf tensor, but because of torch.tensor(tensor, requires_grad=True) it is created as a new leaf tensor and therefore receives its own .grad.
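
A minimal sketch of that detaching behaviour (recent PyTorch versions also emit a UserWarning recommending clone().detach() instead of torch.tensor(tensor); the exact message may vary):

import torch

x = torch.tensor([3.0, 5.0], requires_grad=True)
y = (x * 2).sum()        # part of the graph: y.grad_fn is set
z = torch.tensor(y)      # behaves like y.clone().detach(): cut off from the graph
print(y.grad_fn)         # <SumBackward0 object at ...>
print(z.grad_fn)         # None
print(z.requires_grad)   # False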

About the second issue: l11 and l12 are scalars (zero-dimensional tensors), but torch.cat needs them to be at least 1-D, which you can achieve with tensor.unsqueeze(dim).
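
For reference, here is a minimal sketch of the original example with that fix applied (variable names and values taken from the first post; the printed gradients are what the chain rule predicts):

import torch

x = torch.tensor([3.0, 5.0], requires_grad=True)
w11 = torch.tensor([5.0, 2.0], requires_grad=True)
w12 = torch.tensor([1.5, 3.5], requires_grad=True)
w21 = torch.tensor([-0.5, 2.0], requires_grad=True)

l11 = x @ w11                                          # 0-d tensor
l12 = x @ w12                                          # 0-d tensor
a1 = torch.cat((l11.unsqueeze(0), l12.unsqueeze(0)))   # 1-D, still part of the graph
l21 = w21 @ a1

l21.backward()
print(x.grad)    # tensor([0.5000, 6.0000])
print(w11.grad)  # tensor([-1.5000, -2.5000])
print(w12.grad)  # tensor([6., 10.])
print(w21.grad)  # tensor([25., 22.])

An equivalent alternative is torch.stack((l11, l12)), which inserts the new dimension itself.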

As a general tip, if you use anything other than the operations provided by PyTorch to build your tensors, you are breaking the computational graph. So always try to stay within PyTorch semantics, as this also helps you find bugs later on.
Also, following the tutorials can help you a lot, as they are very well written (and of course the documentation).

Best
