Hello,
I’m a student working on offline character recognition, and I’ve hit an issue where the model’s parameters don’t appear to update during training. Below are my train function and forward method. The forward uses F.log_softmax(), and the loss is F.nll_loss(). All the intermediate tensors (input, output, loss, etc.) look correct, but when I compute “post - pre” to check for an update, the difference is zero, which is very confusing.
print("shape coming in is "+str(x.shape))
x = F.max_pool2d(F.relu(self.conv1(x)), self.kernel)
print("shape after round 1 is "+ str(x.shape))
x = F.max_pool2d(F.relu(self.conv2(x)), self.kernel)
print("shape after round 2 is "+str(x.shape))
x = F.max_pool2d(F.relu(self.conv3(x)), self.kernel)
print("shape after round 3 is "+str(x.shape))
x = x.view(-1, self.flatten_features(x))
print("shape after round 4 view is "+str(x.shape))
x = F.relu(self.fc1(x))
print("shape after round 5 linear 1 is "+str(x.shape))
x = self.fc2(x)
print("shape after round 6 linear 2 is "+str(x.shape))
return F.log_softmax(x, dim=1)  # dim must be given explicitly; the implicit default is deprecated
optimizer.zero_grad()
print("OUTPUT: {}{}".format(output, output.shape))
loss = F.nll_loss(output, chin_char)
print("LOSS: {}{}".format(loss, loss.shape))
pre = list(model.parameters())[0]
loss.backward()
optimizer.step()
post = list(model.parameters())[0]
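For what it’s worth, here is a minimal, self-contained sketch (using a hypothetical one-layer model and random data, not my actual network) of the “post - pre” comparison. Note that `list(model.parameters())[0]` returns a live reference to the parameter tensor, which `optimizer.step()` mutates in place, so a comparison against a detached copy behaves differently from a comparison against the reference:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy model and data, just to illustrate the snapshot pattern.
torch.manual_seed(0)
model = nn.Linear(4, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
target = torch.randint(0, 3, (8,))

optimizer.zero_grad()
output = F.log_softmax(model(x), dim=1)
loss = F.nll_loss(output, target)

# A live reference: this tensor object is updated in place by optimizer.step().
pre_ref = list(model.parameters())[0]
# A true snapshot: a detached copy of the current values.
pre_copy = list(model.parameters())[0].detach().clone()

loss.backward()
optimizer.step()

post = list(model.parameters())[0]
print((post - pre_ref).abs().sum())   # zero: pre_ref and post are the same tensor
print((post - pre_copy).abs().sum())  # nonzero: the parameters did change
```

So a zero “post - pre” may just mean both names point at the same tensor, rather than that training is broken.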