#custom initiliazation of weight
self.hidden_layer_1.weight = torch.nn.Parameter(torch.from_numpy(weights))
a = self.hidden_layer_1.weight
for epoch in range(num_epochs):
epoch_loss_train = 
epoch_loss_validation = 
for i, (x,y) in enumerate(zip(feature_trainloader,label_trainloader), 0):
output = self.forward(x)
target = y
loss = self.CrossEntropyLoss(output, target)
y_true_train = torch.cat((y_true_train, target))
_, pred_class = torch.max(output, dim=1)
y_pred_train = torch.cat((y_pred_train, pred_class))
b = self.hidden_layer_1.weight
print(torch.equal(a, b)) #True
I don’t understand how weights are the same after 10 epochs even though the loss is changing
Could you share a colab notebook which can reproduce the issue?
A minimal implementation would work.
As a sanity check, can you actually save (e.g., with something like (.detatch().clone()) (or print) the values before and after to compare them? I have a suspicion that
b are the same reference so they are both updated as the model is being trained.
At this point, you (presumably) have already created your optimizer, so
your optimizer contains a collection of
Parameters that it will be updating.
Because you have overwritten
weight with a new
than having overwritten
weight.data with a new tensor) – and done so
after creating your optimizer – your optimizer does not contain the new
Parameter that is used (via the forward pass) to calculate the loss.
Parameter that is modified by the optimizer-- or would be if it had
a non-trivial gradient – is the old
Parameter that you have, in a sense, “hidden.”)
self.hidden_layer_1.weight doesn’t change, because it is not in
the optimizer’s list of
Parameters to update.
Why the loss is changing even though
is not: I will assume that
self.hidden_layer_1 is a
has both a
weight and a
bias. You haven’t overwritten
bias, so it is
still updated by your optimizer, and still affects the
loss you calculate.
Note that even after fixing the optimizer issue, the weights should not be compared using references.
For example, this code snippet will report the “saved” weights being the same because both references refer to the same underlying tensor. Copying the weights before training will produce the expected results:
a = torch.nn.Linear(128,1)
b = torch.nn.Sigmoid()
optimizer = torch.optim.SGD(a.parameters(), 1e-3)
criterion = torch.nn.BCELoss()
saved = a.weight
detached = a.weight.detach().clone()
for i in range(0, 512):
inp = torch.randn((1, 128))
label = float(torch.sum(inp[64:]) > 0)
label = torch.tensor([[label]])
output = b(a(inp))
loss = criterion(output, label)
optimizer.grad_ = None
print(saved == a.weight)
print(detached == a.weight)
Thanks for the reply.
I do create my optimizer before this statement, you are correct.
So if I understand correctly,
self.hidden_layer_1.weight.data = my_custom_weights should fix it?
I think that will fix it.
My preferred idiom (not that I know what I’m doing) would be: