Model.weight ordering question

jxmmy7777 · February 27, 2022, 8:30am

Hello, I would like to ask how model.weight is ordered. I’m asking because when I print out the weight and its gradient, it seems that one neuron does not get updated. Does the last row correspond to the last neuron? Or is the order reversed, so that the last row corresponds to the first neuron of the decoder?
Note that my decoder layer is in_dim * 30.

print("grad", self.model.decoder.weight.grad)
print("weight", self.model.decoder.weight)

Thanks!

ptrblck · February 27, 2022, 10:47pm

I’m not sure how you are defining “first”, but here is an example showing how a linear layer’s weight is used to create the output:

import torch
import torch.nn as nn

lin = nn.Linear(2, 3, bias=False)
with torch.no_grad():
    weight = torch.arange(2*3).float().view(3, 2)
    lin.weight.copy_(weight)

print(lin.weight)
# > Parameter containing:
#   tensor([[0., 1.],
#           [2., 3.],
#           [4., 5.]], requires_grad=True)

x = torch.arange(1, 5).float().view(2, 2)
print(x)
# > tensor([[1., 2.],
#           [3., 4.]])

out = lin(x)
print(out)
# > tensor([[ 2.,  8., 14.],
#           [ 4., 18., 32.]], grad_fn=<MmBackward0>)

# corresponds to out = x @ lin.weight.T, i.e. output column j uses weight row j:
# [[(1*0 + 2*1)=2, (1*2 + 2*3)=8, (1*4 + 2*5)=14],
#  [(3*0 + 4*1)=4, (3*2 + 4*3)=18, (3*4 + 4*5)=32]]
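
To connect this back to the gradient question, here is a minimal sketch of how one could check which weight row belongs to which output unit. The sizes (`in_dim = 4`, 30 outputs), the batch size, and the name `decoder` are assumptions for illustration only, not the original poster's model. It backpropagates through the last output unit alone and lists which rows of `weight.grad` are nonzero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed sizes: the thread only says the decoder is "in_dim * 30".
in_dim, out_dim = 4, 30
decoder = nn.Linear(in_dim, out_dim, bias=False)

x = torch.randn(5, in_dim)
out = decoder(x)

# nn.Linear stores its weight as (out_features, in_features):
# one row per output unit.
weight_shape = tuple(decoder.weight.shape)

# Row i of the weight produces output column i.
i = out_dim - 1
row_matches = torch.allclose(out[:, i], x @ decoder.weight[i])

# Backpropagate through the last output unit only, then see
# which rows of the gradient received a nonzero value.
out[:, i].sum().backward()
grad = decoder.weight.grad
nonzero_rows = grad.abs().sum(dim=1).nonzero().flatten().tolist()

print(weight_shape, row_matches, nonzero_rows)
```

In this setup only the last row of `weight.grad` is nonzero, so the last row does correspond to the last output unit and the order is not reversed. By the same logic, a row of zeros in `weight.grad` would suggest that the matching output unit received no gradient from the loss, though the cause depends on the rest of the model.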

jxmmy7777 · February 28, 2022, 7:58pm

Thank you for your reply!