Hi, I have some weather-related data (temperature, etc.) in time series format.
Now I would like to experiment with regression and classification tasks (like predicting tomorrow’s weather, or simply classifying a day as sunny/rainy/cloudy…).
For this task I have an LSTM network, but I don’t know whether I should retain the hidden states or reset them periodically.
Technically speaking:
1.) Should I invoke the recurrent layer without passing hidden states, like this:
lstm_output, _ = lstm(inputs)
2.) Or save the hidden states somewhere and reset them at every minibatch?
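To make option 2 concrete, here is a minimal sketch of what I mean by carrying the states, assuming a plain `nn.LSTM` and made-up sizes:

```python
import torch
import torch.nn as nn

# hypothetical sizes, just for illustration
lstm = nn.LSTM(input_size=4, hidden_size=8)

hidden = None  # (h, c) tuple carried between minibatches
for step in range(3):
    inputs = torch.randn(10, 2, 4)  # [seq_len, batch, features]
    lstm_output, hidden = lstm(inputs, hidden)
    # detach so gradients don't flow back across minibatch boundaries
    hidden = tuple(h.detach() for h in hidden)
    # ...and reset periodically, e.g. at a sequence boundary:
    # hidden = None
```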
If each datapoint is already in time series form, the gradients are accumulated through time automatically, and you don’t need to save the last batch’s hidden states. Also, since the sequences in consecutive batches are not continuous, keeping the hidden states would make the LSTM learn the jumps from sequence to sequence.
If you don’t pass the hidden states to the LSTM, they will be initialized to zeros, so usually you don’t have to worry about them.
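You can check this yourself: calling the LSTM without states gives the same output as explicitly passing all-zero states (a small sketch with made-up sizes):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=8)  # hypothetical sizes
inputs = torch.randn(5, 3, 4)  # [seq_len, batch, features]

# 1) no hidden state passed -> initialized to zeros internally
out_a, _ = lstm(inputs)

# 2) explicitly passing zero states gives the same result
h0 = torch.zeros(1, 3, 8)  # [num_layers, batch, hidden_size]
c0 = torch.zeros(1, 3, 8)
out_b, _ = lstm(inputs, (h0, c0))

print(torch.allclose(out_a, out_b))  # True
```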
Also, I have to mention that the batches are sampled randomly. So inside a batch the data is sequential, but the next batch does not contain the next data points.
I don’t think that’s a problem. For an LSTM with batch_first=False, the input should be of shape [L, B, H]. For an LSTM with batch_first=True, the input should be of shape [B, L, H].
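A quick sketch of the two layouts, with made-up sizes (B=batch, L=sequence length, H=feature/hidden size):

```python
import torch
import torch.nn as nn

B, L, H_in, H_out = 2, 5, 4, 8  # hypothetical sizes

# batch_first=False (the default): input is [L, B, H_in]
lstm_seq_first = nn.LSTM(input_size=H_in, hidden_size=H_out)
out1, _ = lstm_seq_first(torch.randn(L, B, H_in))
print(out1.shape)  # torch.Size([5, 2, 8])

# batch_first=True: input is [B, L, H_in]
lstm_batch_first = nn.LSTM(input_size=H_in, hidden_size=H_out, batch_first=True)
out2, _ = lstm_batch_first(torch.randn(B, L, H_in))
print(out2.shape)  # torch.Size([2, 5, 8])
```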