I am working on a neural network architecture whose parameters do not seem to update during training. As a result the loss does not change, and in many cases the network fails to train at all, i.e. the loss at the end of training is higher than at the start. I have checked the parameters using:
before = list(net.parameters())
neuralnet.update()
after = list(net.parameters())
for i in range(len(before)):
    print(torch.equal(before[i].data, after[i].data))
It returns True for each iteration.
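(Note that this check can report True even when the optimizer is working: list(net.parameters()) stores references to the live tensors, so both snapshots point at the same storage and an in-place update changes them both. A minimal sketch of a clone-based comparison, using a stand-alone nn.Linear and SGD optimizer as placeholders for the real network and update step:)

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real network and its optimizer.
net = nn.Linear(5, 5)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

# .detach().clone() copies the current values into fresh storage,
# so a later in-place update by the optimizer cannot change `before`.
before = [p.detach().clone() for p in net.parameters()]

loss = net(torch.rand(3, 5)).sum()
loss.backward()
optimizer.step()

for b, p in zip(before, net.parameters()):
    print(torch.equal(b, p.detach()))  # False once an update was applied
```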
One solution I have found online suggests calling .clone() on each parameter, since the update happens in place and the two snapshots otherwise reference the same tensors. I am not quite sure how to achieve this in my implementation.
Another solution suggests initialising the weights via the nn.Parameter class, but I am also not sure how to do this in my current implementation.
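(For context, nn.Parameter matters when weights are created as raw tensors: only tensors wrapped in nn.Parameter are registered with the module and therefore seen by net.parameters() and the optimizer. A sketch of a hypothetical hand-initialised layer, not taken from the code below:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ManualLayer(nn.Module):
    """Hypothetical module with manually initialised weights."""
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensors with the module, so they
        # appear in .parameters() and receive gradients during backward.
        self.w1 = nn.Parameter(torch.randn(10, 5) * 0.1)
        self.b1 = nn.Parameter(torch.zeros(10))

    def forward(self, state):
        return F.relu(F.linear(state, self.w1, self.b1))

net = ManualLayer()
print(len(list(net.parameters())))  # 2: w1 and b1 are both registered
```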
As it stands, my net is instantiated as follows:
class Node(nn.Module):
    def __init__(self):
        super().__init__()
        # common network weights
        self.fc1 = nn.Linear(5, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 5)

    def step(self, state):
        # single step
        state = F.relu(F.linear(state, self.fc1.weight, self.fc1.bias))
        state = F.relu(F.linear(state, self.fc2.weight, self.fc2.bias))
        state = F.linear(state, self.fc3.weight, self.fc3.bias)
        return state

    def forward(self, input_seq):
        state = Variable(torch.zeros(5))
        for k in range(len(input_seq)):
            state[0] = input_seq[k]
            state = self.step(state)
        return state
As for the training loop, I have:
for k in range(10):
    input = Variable(torch.rand(2, 1))
    target = some_function(input)
    output = net(input)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
How do I ensure the parameters are being updated in my network, and, if need be, how can I change my weight initialisation so that the update happens?
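(For reference, here is a self-contained sketch of how the pieces could fit together: a Node-style recurrent module that avoids in-place assignment into the state, a training loop, and a clone-based check that the parameters actually moved. The target some_function is unknown, so a placeholder sum is used, and the in-place state[0] = ... write is replaced by torch.cat to keep the autograd graph intact; these are assumptions, not the original code.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Node(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(5, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 5)

    def forward(self, input_seq):
        state = torch.zeros(5)
        for x in input_seq:
            # Build a fresh state tensor rather than assigning in place,
            # which keeps gradient flow through the recurrence.
            state = torch.cat([x.view(1), state[1:]])
            state = F.relu(self.fc1(state))
            state = F.relu(self.fc2(state))
            state = self.fc3(state)
        return state

net = Node()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
criterion = nn.MSELoss()

# Clone snapshots so the optimizer's in-place updates don't touch them.
before = [p.detach().clone() for p in net.parameters()]

for k in range(10):
    input_seq = torch.rand(2)
    target = input_seq.sum().expand(5)  # placeholder for some_function
    optimizer.zero_grad()
    loss = criterion(net(input_seq), target)
    loss.backward()
    optimizer.step()

changed = [not torch.equal(b, p.detach())
           for b, p in zip(before, net.parameters())]
print(changed)
```

With this check, any parameter that received a non-zero gradient will report True after training.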