I’ve built a custom LSTM model for a prediction problem I’ve been working on, and the training loss gets decently low, so I know the model trains well. However, right after training, when I run the exact same dataset through the model in model.eval() mode (or under torch.no_grad(), without applying the optimizer or computing gradients), the loss increases drastically, from the order of 1e-03 to 1e-01 or even above. Since the model right after training should give roughly the same predictions it gave during training, I understand gradient computation can affect the model, but it shouldn’t account for an increase of that magnitude.

I’m not quite sure what I’m doing wrong.

The model is of the type:

```
class customLSTM(nn.Module):
    def __init__(self):
        super(customLSTM, self).__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2).cuda()
        self.linear = nn.Linear(50, 5).cuda()
```

The custom forward function:

```
def forward(self, input):
    input = input.cuda()
    self.lstm1.flatten_parameters()
    # nn.LSTM returns (output, (h_n, c_n))
    out, (h_t1, c_t1) = self.lstm1(input)
    self.prev_c_t1 = c_t1          # stash the final cell state
    output = self.linear(out)      # regression head
    return output
```
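As a side note on the unpacking above: `nn.LSTM` returns the full per-step output sequence plus a *tuple* of final states, so assigning its result to two plain names binds the whole `(h_n, c_n)` tuple to the second variable. A quick CPU-only check (shapes assume the same 5-in / 50-hidden / 2-layer setup):

```
import torch
import torch.nn as nn

lstm = nn.LSTM(5, 50, num_layers=2)
x = torch.randn(20, 1, 5)          # (seq_len, batch, input_size)
out, (h_n, c_n) = lstm(x)

print(out.shape)   # torch.Size([20, 1, 50]) - output at every time step
print(h_n.shape)   # torch.Size([2, 1, 50])  - final hidden state per layer
print(c_n.shape)   # torch.Size([2, 1, 50])  - final cell state per layer
```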

The training is more or less:

```
optimizer.zero_grad()            # one zero_grad per step is enough
out = self(each)                 # invokes forward()
loss = criterion(out, loss_cmp)  # loss_cmp is the target, out the prediction
loss.backward()
optimizer.step()
```

while the test function:

```
self.eval()
with torch.no_grad():
    out = self(each)
    loss = criterion(out, loss_cmp)
```
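For comparison, here is a minimal CPU-only sketch of the full train-then-evaluate cycle (model, data, and hyperparameters are illustrative, not the actual setup). With no dropout or batch norm in the model, the loss computed on the same data in train mode and in eval mode under `torch.no_grad()` should agree to floating-point precision:

```
import torch
import torch.nn as nn

class CustomLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(5, 50, num_layers=2)
        self.linear = nn.Linear(50, 5)

    def forward(self, x):
        out, _ = self.lstm1(x)
        return self.linear(out)

torch.manual_seed(0)
model = CustomLSTM()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(20, 1, 5)   # (seq_len, batch, features)
y = torch.randn(20, 1, 5)

for _ in range(50):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# Loss on the same data, once more in train mode...
train_loss = criterion(model(x), y).item()

# ...and in eval mode under no_grad.
model.eval()
with torch.no_grad():
    eval_loss = criterion(model(x), y).item()

print(abs(train_loss - eval_loss))  # ~0.0
```

If the two numbers diverge in a setup like this, the difference usually comes from something outside the forward pass itself, e.g. hidden state carried over between batches, dtype/device mismatches, or comparing a running training loss against a full-dataset loss.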

**First time posting here, so do let me know if I’m doing something wrong!**