I am working on a neural network architecture whose parameters do not seem to update during training. As a result the loss does not change, and in many cases the network fails to train at all, i.e. the loss at the end of training is higher than at the start. I have checked the parameters using:
before = list(net.parameters())
neuralnet.update()
after = list(net.parameters())
for i in range(len(before)):
    print(torch.equal(before[i].data, after[i].data))
It returns True for each iteration.
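(Note that this check can report True even when the optimizer is working: list(net.parameters()) stores references to the live tensors, so both snapshots point at the same storage and an in-place update changes them both. A minimal sketch of a clone-based comparison, using a stand-alone nn.Linear and SGD optimizer as placeholders for the real network and update step:)

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real network and its optimizer.
net = nn.Linear(5, 5)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

# .detach().clone() copies the current values into fresh storage,
# so a later in-place update by the optimizer cannot change `before`.
before = [p.detach().clone() for p in net.parameters()]

loss = net(torch.rand(3, 5)).sum()
loss.backward()
optimizer.step()

for b, p in zip(before, net.parameters()):
    print(torch.equal(b, p.detach()))  # False once an update was applied
```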
One solution I have found online suggests calling .clone() on each parameter, since the update happens in place and the two snapshots otherwise reference the same tensors. I am not quite sure how to achieve this in my implementation.
Another solution suggests initialising the weights via the nn.Parameter class, but I am also not sure how to do this in my current implementation.
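(For context, nn.Parameter matters when weights are created as raw tensors: only tensors wrapped in nn.Parameter are registered with the module and therefore seen by net.parameters() and the optimizer. A sketch of a hypothetical hand-initialised layer, not taken from the code below:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ManualLayer(nn.Module):
    """Hypothetical module with manually initialised weights."""
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensors with the module, so they
        # appear in .parameters() and receive gradients during backward.
        self.w1 = nn.Parameter(torch.randn(10, 5) * 0.1)
        self.b1 = nn.Parameter(torch.zeros(10))

    def forward(self, state):
        return F.relu(F.linear(state, self.w1, self.b1))

net = ManualLayer()
print(len(list(net.parameters())))  # 2: w1 and b1 are both registered
```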
As it stands, my net is instantiated as follows:
class Node(nn.Module):
    def __init__(self):
        super().__init__()
        # common network weights
        self.fc1 = nn.Linear(5, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 5)

    def step(self, state):
        # single step
        state = F.relu(F.linear(state, self.fc1.weight, self.fc1.bias))
        state = F.relu(F.linear(state, self.fc2.weight, self.fc2.bias))
        state = F.linear(state, self.fc3.weight, self.fc3.bias)
        return state

    def forward(self, input_seq):
        state = Variable(torch.zeros(5))
        for k in range(len(input_seq)):
            state[0] = input_seq[k]
            state = self.step(state)
        return state
As for the training loop, I have:
for k in range(10):
    input = Variable(torch.rand(2, 1))
    target = some_function(input)
    output = net(input)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
How do I ensure the parameters are being updated in my network, and, if need be, how can I change my weight initialisation so that the update happens?
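(For reference, here is a self-contained sketch of how the pieces could fit together: a Node-style recurrent module that avoids in-place assignment into the state, a training loop, and a clone-based check that the parameters actually moved. The target some_function is unknown, so a placeholder sum is used, and the in-place state[0] = ... write is replaced by torch.cat to keep the autograd graph intact; these are assumptions, not the original code.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Node(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(5, 10)
        self.fc2 = nn.Linear(10, 10)
        self.fc3 = nn.Linear(10, 5)

    def forward(self, input_seq):
        state = torch.zeros(5)
        for x in input_seq:
            # Build a fresh state tensor rather than assigning in place,
            # which keeps gradient flow through the recurrence.
            state = torch.cat([x.view(1), state[1:]])
            state = F.relu(self.fc1(state))
            state = F.relu(self.fc2(state))
            state = self.fc3(state)
        return state

net = Node()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
criterion = nn.MSELoss()

# Clone snapshots so the optimizer's in-place updates don't touch them.
before = [p.detach().clone() for p in net.parameters()]

for k in range(10):
    input_seq = torch.rand(2)
    target = input_seq.sum().expand(5)  # placeholder for some_function
    optimizer.zero_grad()
    loss = criterion(net(input_seq), target)
    loss.backward()
    optimizer.step()

changed = [not torch.equal(b, p.detach())
           for b, p in zip(before, net.parameters())]
print(changed)
```

With this check, any parameter that received a non-zero gradient will report True after training.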