Network weights won't update

Hi,

I hope you can help. I am designing a controller for a system. The controller is a neural network that is trained through a second, imported network I made, which is a state-space representation of the system.

But it seems the weights are not updating on each epoch. I have attached the code below.

This is my defined controller:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, n1, n2, n3):
        super().__init__()
        self.fc1 = nn.Linear(10 * 2, n1)   # 20 input features
        self.fc2 = nn.Linear(n1, n2)
        self.fc3 = nn.Linear(n2, n3)
        self.fc4 = nn.Linear(n3, 1)        # single control output

    def forward(self, x):
        x = F.gelu(self.fc1(x))
        x = F.gelu(self.fc2(x))
        x = F.gelu(self.fc3(x))
        x = F.gelu(self.fc4(x))
        return x

below is my training loop:

LR = 0.001
w_decay = 0.005
optimizer = optim.SGD(dpc.parameters(), lr=LR, weight_decay=w_decay)
EPOCHS = 50
batch = 2
print('Training network...')
dpc.train(mode=True)
combined_loss = 0.0
for epoch in range(EPOCHS):
    for dd in range(0, len(x_train) - batch):
        X = x_train[dd:dd + batch]
        optimizer.zero_grad()
        output = dpc(X)
        # append the controller output to the state before feeding the state-space model
        input_ssm = X.detach().numpy()
        input_ssm = input_ssm.tolist()
        input_ssm[0].append(output.detach().numpy().tolist()[0][0])
        input_ssm[1].append(output.detach().numpy().tolist()[1][0])
        input_ssm = np.array(input_ssm)
        input_ssm = torch.Tensor(input_ssm)
        y = ss_model(input_ssm)
        # un-normalise the model output
        y = (y * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
        desired = torch.Tensor(np.array([[0], [0]]))
        loss = F.mse_loss(y, desired)
        combined_loss += loss
        loss.backward()
        optimizer.step()

Many thanks.

Hi Finn!

Pytorch’s autograd facility – which automatically calculates the gradients
that are then used to update your network weights when you call
optimizer.step() – only tracks computations performed with pytorch
tensor operations.

So as soon as you call numpy() (and detach(), for that matter) you
have “broken the computation graph” and pytorch can no longer
backpropagate through those calculations.
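
For example (a minimal sketch illustrating the point, not your actual code):

import torch

a = torch.ones(3, requires_grad=True)
b = (2 * a).detach()     # detach() (or .numpy()) severs the graph here
c = (3 * b).sum()
print(c.grad_fn)         # None -- autograd cannot backpropagate to a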

Rewrite the manipulations that you are performing with numpy using
pytorch tensor operations, and autograd should work and your weights
should update.
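
In your case, the whole numpy round trip that builds input_ssm can likely
be replaced by a single torch.cat (a sketch, assuming X has shape
[batch, 20] and output has shape [batch, 1]):

# stay in pytorch: append the controller output as a 21st column
input_ssm = torch.cat((X, output), dim=1)   # shape [batch, 21], gradients flow
y = ss_model(input_ssm)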

Best.

K. Frank


Thank you for your response. I have updated the training loop as follows:

for epoch in range(EPOCHS):
    for dd in range(0, len(x_train) - batch):
        X = x_train[dd:dd + batch]
        optimizer.zero_grad()
        output = dpc(X)
        # build the state-space input with tensor operations this time
        input_ssm = torch.empty((1, 21))
        for t in range(0, 20):
            input_ssm[0][t] = X[0][t]
        input_ssm[0][20] = output[0][0]
        y = ss_model(input_ssm)
        y = (y * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
        desired = torch.Tensor([[0]])
        loss = F.mse_loss(y, desired)
        combined_loss += loss
        optimizer.step()

Yet the issue still persists and the weights are not updating. Am I doing this incorrectly?

Many thanks,
Finn

Hi Finn!

First, let me note that this second version you posted no longer has
loss.backward() in it. If that’s not just a typo, it would prevent your
weights from being updated.
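
For reference, the standard order of operations in a single training step
looks like this (a schematic sketch; model, X, and target are placeholders):

optimizer.zero_grad()              # clear gradients from the previous step
output = model(X)                  # forward pass
loss = F.mse_loss(output, target)  # compute the loss
loss.backward()                    # backward pass -- populates .grad
optimizer.step()                   # update the weights using .grad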

Second, please post a simplified, fully-self-contained, runnable script
that illustrates your issue, together with its output. For example, it looks
like what you posted has two models in it. Can you reproduce your issue
with only one model?

I would also suggest that you try a single forward and backward pass
(rather than the loops over epochs). Does a single pass create grads
for your model weights (e.g., my_model.fc1.weight.grad)? Are these
gradients non-zero? Does optimizer.step() then modify the weights?
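
For example, something along these lines (a self-contained sketch with a
toy model, not your actual network):

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

before = model.weight.detach().clone()
loss = F.mse_loss(model(torch.randn(2, 4)), torch.zeros(2, 1))
opt.zero_grad()
loss.backward()
print(model.weight.grad)                      # should be a non-zero tensor
opt.step()
print((model.weight - before).abs().sum())    # should be > 0 if the step worked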

Best.

K. Frank
