I am confused about hidden state initialization during validation/testing for an LSTM.
During training, we initialize the hidden and cell states to zero and start training, but what should be done during validation or testing?
- Should we reinitialize the hidden and cell states to zero during validation? In that case, wouldn't all the learning be lost?
- Or can we reuse the hidden states from training?
```python
with torch.no_grad():
    val_h = net.init_hidden(batch_size)  # initialize to zero --> is this the right way?
    net.eval()
    for inputs, labels in valid_loader:
        val_h = tuple([each.data for each in val_h])
        inputs, labels = inputs.cuda(), labels.cuda()
        log_ps, val_h = net(inputs, val_h)
        valid_loss += criterion(log_ps, labels.long())
```
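For context, `init_hidden` in setups like this usually just allocates fresh zero tensors shaped for the LSTM's `(h_0, c_0)`; it holds no learned parameters. Below is a minimal self-contained sketch, where the model class and the sizes (`input_size`, `n_hidden`, `n_layers`, output size) are hypothetical placeholders, not the actual network from the question:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, input_size=8, n_hidden=16, n_layers=2, n_out=4):
        super().__init__()
        self.n_hidden = n_hidden
        self.n_layers = n_layers
        self.lstm = nn.LSTM(input_size, n_hidden, n_layers, batch_first=True)
        self.fc = nn.Linear(n_hidden, n_out)

    def forward(self, x, hidden):
        out, hidden = self.lstm(x, hidden)
        return self.fc(out[:, -1]), hidden

    def init_hidden(self, batch_size):
        # Fresh zero tensors for (h_0, c_0). These are activations, not
        # weights, so re-zeroing them at validation time does not discard
        # anything the model has learned -- the learning lives in the
        # LSTM/Linear parameters.
        weight = next(self.parameters())
        return (weight.new_zeros(self.n_layers, batch_size, self.n_hidden),
                weight.new_zeros(self.n_layers, batch_size, self.n_hidden))

net = Net()
h = net.init_hidden(batch_size=3)
out, h = net(torch.randn(3, 5, 8), h)   # batch of 3 sequences of length 5
print(out.shape)  # torch.Size([3, 4])
```

`weight.new_zeros(...)` is a common idiom so the zero states land on the same device and dtype as the model parameters.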