I’ve built a custom LSTM model for a prediction problem I’ve been working on, and the training loss gets decently low, so I know the model trains well. However, right after training, when I run the exact same dataset through the model in model.eval() mode (or under torch.no_grad(), without applying the optimizer or computing gradients), the loss increases drastically, from the order of 1e-03 to 1e-01 or even above. Since the model right after training should give roughly the same predictions it gave during training, I understand gradient computation can affect the model, but it shouldn’t account for an increase of that magnitude.

I’m not quite sure what I’m doing wrong.

The model is of the type:

```
class customLSTM(nn.Module):
    def __init__(self):
        super(customLSTM, self).__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2).cuda()
        self.linear = nn.Linear(50, 5).cuda()
```

The custom forward function:

```
def forward(self, input):
    input = input.cuda()
    self.lstm1.flatten_parameters()
    # nn.LSTM returns (output, (h_n, c_n))
    out, (h_t1, c_t1) = self.lstm1(input)
    self.prev_c_t1 = c_t1          # stash the final cell state
    output = self.linear(out)      # regression head
    return output
```
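As a side note on the unpacking above: `nn.LSTM` returns the full per-step output sequence plus a *tuple* of final states, so assigning its result to two plain names binds the whole `(h_n, c_n)` tuple to the second variable. A quick CPU-only check (shapes assume the same 5-in / 50-hidden / 2-layer setup):

```
import torch
import torch.nn as nn

lstm = nn.LSTM(5, 50, num_layers=2)
x = torch.randn(20, 1, 5)          # (seq_len, batch, input_size)
out, (h_n, c_n) = lstm(x)

print(out.shape)   # torch.Size([20, 1, 50]) - output at every time step
print(h_n.shape)   # torch.Size([2, 1, 50])  - final hidden state per layer
print(c_n.shape)   # torch.Size([2, 1, 50])  - final cell state per layer
```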

The training is more or less:

```
optimizer.zero_grad()            # one zero_grad per step is enough
out = self(each)                 # invokes forward()
loss = criterion(out, loss_cmp)  # loss_cmp is the target, out the prediction
loss.backward()
optimizer.step()
```

while the test function:

```
self.eval()
with torch.no_grad():
    out = self(each)
    loss = criterion(out, loss_cmp)
```
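For comparison, here is a minimal CPU-only sketch of the full train-then-evaluate cycle (model, data, and hyperparameters are illustrative, not the actual setup). With no dropout or batch norm in the model, the loss computed on the same data in train mode and in eval mode under `torch.no_grad()` should agree to floating-point precision:

```
import torch
import torch.nn as nn

class CustomLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2)
        self.linear = nn.Linear(50, 5)

    def forward(self, x):
        out, _ = self.lstm1(x)
        return self.linear(out)

torch.manual_seed(0)
model = CustomLSTM()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(20, 1, 5)   # (seq_len, batch, features)
y = torch.randn(20, 1, 5)

for _ in range(50):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# Loss on the same data, once more in train mode...
train_loss = criterion(model(x), y).item()

# ...and in eval mode under no_grad.
model.eval()
with torch.no_grad():
    eval_loss = criterion(model(x), y).item()

print(abs(train_loss - eval_loss))  # ~0.0
```

If the two numbers diverge in a setup like this, the difference usually comes from something outside the forward pass itself, e.g. hidden state carried over between batches, dtype/device mismatches, or comparing a running training loss against a full-dataset loss.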

**First time posting here, so do let me know if I’m doing something wrong!**