I’ve built a custom LSTM model (class `customLSTM`) for a prediction problem I’ve been working on, and the training loss gets decently low, so I know it has been training well. However, right after training, when I run the exact same dataset in `model.eval()` mode or under `torch.no_grad()`, without applying the optimizer or loss function, the loss drastically increases. The model after training must give roughly the same predictions as it did during training; even though I understand that gradient computation can change the model, it shouldn’t be responsible for the error increasing from the order of 1e-03 to 1e-01 or even higher.
I’m not quite sure what I’m doing wrong.
The model is of the type:
```python
super(customLSTM, self).__init__()
self.lstm1 = nn.LSTM(5, 50, num_layers=2).cuda()
self.linear = nn.Linear(50, 5).cuda()
```
The custom forward function:
```python
def forward(self, input):
    outputs = []
    input = input.cuda()
    # each = each.view(1, 1, -1)
    self.lstm1.flatten_parameters()
    h_t1, c_t1 = self.lstm1(input)  # (h_t1, c_t1))
    self.prev_c_t1 = c_t1
    h_t1 = h_t1.double()
    h_t1.cuda()
    output = self.linear(h_t1)  # regression
    return output
```
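For reference, here is a condensed, CPU-only sketch of the two snippets above assembled into one module (class and variable names are my assumptions, the `.cuda()` calls are dropped so it runs anywhere, and the `.double()` cast is omitted so dtypes stay consistent). Note that `nn.LSTM` returns `(output, (h_n, c_n))`, so unpacking it into two names as above makes the second name the `(h_n, c_n)` tuple:

```python
import torch
import torch.nn as nn

class CustomLSTM(nn.Module):  # stand-in for the customLSTM class above
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2)
        self.linear = nn.Linear(50, 5)

    def forward(self, input):
        self.lstm1.flatten_parameters()
        out, (h_n, c_n) = self.lstm1(input)  # out: (seq_len, batch, hidden)
        self.prev_c_t1 = c_n                 # keep the last cell state around
        return self.linear(out)              # regression back to 5 features
```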
The training is more or less:
```python
self.zero_grad()
out = self.forward(each)
loss = criterion(out, loss_cmp)  # loss_cmp is the correct value, out is the prediction
optimizer.zero_grad()
loss.backward()
optimizer.step()
```
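Assembled into a runnable loop, it looks like this (everything here — the optimizer choice, the MSE criterion, the `(seq_len, batch, features)` data layout, and the random data — is a stand-in assumption just to make the fragment self-contained):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# stand-in model pieces matching the shapes in the post
lstm = nn.LSTM(5, 50, num_layers=2)
linear = nn.Linear(50, 5)

def forward(x):
    out, _ = lstm(x)    # out: (seq_len, batch, hidden)
    return linear(out)  # project back to 5 features

criterion = nn.MSELoss()  # assumed; the post doesn't name the criterion
params = list(lstm.parameters()) + list(linear.parameters())
optimizer = torch.optim.Adam(params, lr=1e-2)  # assumed optimizer

each = torch.randn(10, 1, 5)      # (seq_len, batch, features)
loss_cmp = torch.randn(10, 1, 5)  # target, same shape as the prediction

losses = []
for step in range(50):
    optimizer.zero_grad()
    out = forward(each)
    loss = criterion(out, loss_cmp)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```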
while the test function is just:
```python
out = self.forward(each)
loss = criterion(out, loss_cmp)
```
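To illustrate the expectation from the first paragraph, here is a self-contained sketch (stand-in model and random data, all assumed) comparing the loss from a plain forward pass against the same pass under `model.eval()` and `torch.no_grad()`. Since a model like the one above contains no dropout or batch-norm layers (`nn.LSTM` defaults to `dropout=0`), the two losses should match exactly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyLSTM(nn.Module):  # stand-in for customLSTM
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2)
        self.linear = nn.Linear(50, 5)

    def forward(self, x):
        out, _ = self.lstm1(x)
        return self.linear(out)

model = TinyLSTM()
criterion = nn.MSELoss()
each = torch.randn(10, 1, 5)      # same data for both passes
loss_cmp = torch.randn(10, 1, 5)

# "training-mode" loss: just a forward pass, no optimizer step
train_loss = criterion(model(each), loss_cmp)

# eval-mode loss on the exact same data
model.eval()
with torch.no_grad():
    eval_loss = criterion(model(each), loss_cmp)
```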
First time posting here, so do let me know if I’m doing something wrong!