I am confused about hidden state initialization during validation/testing for an LSTM.
During training, we initialize the hidden and cell states to zero and start training, but what should be done during validation or testing?
- Should we reinitialize the hidden and cell states to zero during validation? In that case, wouldn't all the learning be lost?
- Or can we reuse the hidden states from training?
```python
with torch.no_grad():
    val_h = net.init_hidden(batch_size)  # initialize to zero --> is this the right way?
    net.eval()
    for inputs, labels in valid_loader:
        val_h = tuple([each.data for each in val_h])
        inputs, labels = inputs.cuda(), labels.cuda()
        log_ps, val_h = net(inputs, val_h)
        valid_loss += criterion(log_ps, labels.long())
```
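For context, `init_hidden` in setups like this usually just allocates fresh zero tensors shaped for the LSTM's `(h_0, c_0)`; it holds no learned parameters. Below is a minimal self-contained sketch, where the model class and the sizes (`input_size`, `n_hidden`, `n_layers`, output size) are hypothetical placeholders, not the actual network from the question:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, input_size=8, n_hidden=16, n_layers=2, n_out=4):
        super().__init__()
        self.n_hidden = n_hidden
        self.n_layers = n_layers
        self.lstm = nn.LSTM(input_size, n_hidden, n_layers, batch_first=True)
        self.fc = nn.Linear(n_hidden, n_out)

    def forward(self, x, hidden):
        out, hidden = self.lstm(x, hidden)
        return self.fc(out[:, -1]), hidden

    def init_hidden(self, batch_size):
        # Fresh zero tensors for (h_0, c_0). These are activations, not
        # weights, so re-zeroing them at validation time does not discard
        # anything the model has learned -- the learning lives in the
        # LSTM/Linear parameters.
        weight = next(self.parameters())
        return (weight.new_zeros(self.n_layers, batch_size, self.n_hidden),
                weight.new_zeros(self.n_layers, batch_size, self.n_hidden))

net = Net()
h = net.init_hidden(batch_size=3)
out, h = net(torch.randn(3, 5, 8), h)   # batch of 3 sequences of length 5
print(out.shape)  # torch.Size([3, 4])
```

`weight.new_zeros(...)` is a common idiom so the zero states land on the same device and dtype as the model parameters.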