# Only one layer of weights updates, why?

I created a NN with one hidden layer, so I have two tensors of weights.
Then I train it, so I expect both sets of weights to update. The problem is that only the second set of weights updates; the first stays the same!
Why? I have tried several possible solutions, but nothing works.

This is the code (I removed the imports and the dataset loading). I print `par` before and after training, and it is clear that the first set of weights did not change:

```python
class NN1(nn.Module):
    def __init__(self, D_1, D_2, H, D_out):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(D_1 * D_2, H, bias=False),
            nn.ReLU(),
            nn.Linear(H, D_out, bias=False),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NN1(28, 28, 756, 10)
par = list(model.parameters())
print(par)

def train_loop(model, training_data, batch_size, eta, epochs):
    optimizer = torch.optim.SGD(par, lr=eta)
    loss = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        for batch_idx, (X, y) in enumerate(train_dataloader):
            ypred = model(X)
            train_loss = loss(ypred, y)
            if batch_idx % (6400 / batch_size) == 0:
                print(f'Epoch [{epoch + 1}/{epochs}] Batch [{batch_idx}/{len(train_dataloader)}] Training loss: {train_loss.item()}')

            optimizer.zero_grad()  # reset accumulated gradients before this batch's backward pass
            train_loss.backward()
            optimizer.step()

    return model

print(par)
```

Just printing the parameter might not show enough decimals, so clone the original parameter and subtract the updated one from it. Also check the gradients via the `.grad` attribute to see how small they are.
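
Something along these lines (a minimal sketch; it assumes the `model` and `train_loop` from your post, plus the `training_data` object from the dataset code you removed, and hyperparameter values chosen just for illustration):

```python
# Snapshot the first layer's weights before training; detach().clone()
# gives an independent copy that later optimizer steps cannot modify.
w0_before = model.linear_relu_stack[0].weight.detach().clone()

train_loop(model, training_data, batch_size=64, eta=0.1, epochs=1)

# An all-zero difference would mean the first layer truly did not move.
diff = model.linear_relu_stack[0].weight.detach() - w0_before
print(diff.abs().max())

# After a backward pass every parameter should have a populated .grad;
# the largest absolute entry tells you whether it is tiny or exactly zero.
for name, p in model.named_parameters():
    print(name, None if p.grad is None else p.grad.abs().max().item())
```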

Thank you very much.

Unfortunately I have tried both things, but nothing changed. When I clone the original parameter and subtract the updated one from it, I get a tensor of all zeros, so I believe they are exactly equal (whereas for the second set of parameters the difference is nonzero).

When I print the gradients during the training loop they appear to be exactly 0 (I don't know if there is an approximation that I can't see, but I don't think so).

PROBLEM SOLVED
By reading some other posts I understood the problem. Actually, there was not really a problem: when I printed the gradients and/or the weights, the output was clearly truncated, and the part I was seeing simply was not updated, while other parts that I was not seeing were being updated.
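
For anyone who hits the same thing, here is a quick way to avoid being fooled by the truncated printout (a minimal sketch reusing the `model`, `train_loop`, and `training_data` from above, with illustrative hyperparameters):

```python
# Raise the print threshold so small-to-medium tensors are printed in full
# instead of PyTorch's default summarized (truncated) view.
torch.set_printoptions(threshold=10_000)

# Comparing tensors element-wise is still more reliable than eyeballing:
w_before = [p.detach().clone() for p in model.parameters()]
train_loop(model, training_data, batch_size=64, eta=0.1, epochs=1)
for p_old, p_new in zip(w_before, model.parameters()):
    # torch.equal checks every element, including the ones the print hides.
    print(torch.equal(p_old, p_new.detach()))
```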