Model.weight ordering question

Hello, I would like to ask how model.weight is ordered. I'm asking because when I print the weight and its gradient, it looks like one neuron is not getting updated. Does the last row correspond to the last neuron? Or is the order reversed, so that the last row corresponds to the first neuron of the decoder?
Note that my decoder layer is in_dim * 30

print("grad",self.model.decoder.weight.grad)
print("grad",self.model.decoder.weight)

Thanks!

I’m not sure how you are defining “first”, but here is an example showing how a linear layer’s weight is used to create the output:

import torch
import torch.nn as nn

lin = nn.Linear(2, 3, bias=False)
with torch.no_grad():
    weight = torch.arange(2*3).float().view(3, 2)
    lin.weight.copy_(weight)

print(lin.weight)
# > Parameter containing:
#   tensor([[0., 1.],
#           [2., 3.],
#           [4., 5.]], requires_grad=True)

x = torch.arange(1, 5).float().view(2, 2)
print(x)
# > tensor([[1., 2.],
#           [3., 4.]])

out = lin(x)
print(out)
# > tensor([[ 2.,  8., 14.],
#           [ 4., 18., 32.]], grad_fn=<MmBackward0>)

# corresponds to
# [[(1*0 + 2*1)=2, (1*2 + 2*3)=8, (1*4 + 2*5)=14],
#  [(3*0 + 4*1)=4, (3*2 + 4*3)=18, (3*4 + 4*5)=32]]
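
So the rows of lin.weight line up with the output features: row i holds the weight vector of output neuron i, and the last row corresponds to the last output neuron, not the first. Continuing the snippet above (just a sketch reusing the same lin, x, and out), one way to see this is to backprop a loss that only depends on the last output column; only the last row of the weight gradient should then be nonzero:

# out is just x @ weight.T, so weight row i produces output column i
print(torch.allclose(out, x @ lin.weight.t()))
# > True

# loss that only depends on the last output neuron
loss = out[:, -1].sum()
loss.backward()

print(lin.weight.grad)
# > tensor([[0., 0.],
#           [0., 0.],
#           [4., 6.]])

If you see a row of your decoder's weight.grad that is always zero, the corresponding output unit is the one that isn't receiving any gradient.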

Thank you for your reply!