LSTM for TSC - effect of sequence length

I am developing a time series classification (TSC) model based on an LSTM followed by a fully connected layer; below is a portion of the code:

def forward(self, x):
    # batch dimension index depends on batch_first
    batch_size = x.size(0) if self.get_batch_first() else x.size(1)
    if not self.get_is_stateful():  # stateless: reset hidden and cell state at each forward pass
        h0 = torch.zeros(self.num_layers, batch_size, self.hidden_size).to(self.get_device())
        c0 = torch.zeros(self.num_layers, batch_size, self.hidden_size).to(self.get_device())
        out_lstm, _ = self.lstm(x, (h0, c0))
    else:  # stateful: keep hidden and cell state between forward passes
        h0 = self.hidden_state.detach().clone()
        c0 = self.cell_state.detach().clone()
        out_lstm, (self.hidden_state, self.cell_state) = self.lstm(x, (h0, c0))
    # classify from the output of the last time step
    out = self.fc(out_lstm[:, -1, :]) if self.get_batch_first() else self.fc(out_lstm[-1, :, :])
    return out

What happens if I train the model on the whole sequence instead of passing the input one timestamp at a time? From the documentation, I understand the output will have one dimension equal to the length of the sequence, but in this case:

1) Would the forward method be called for each timestamp sequentially (so that the hidden state becomes the input for the next timestamp), or in parallel?
2) The only thing I am fairly sure of is that there is only one model, not length(sequence) models in parallel.

In the case of a PyTorch LSTM, you can pass in the entire sequence and it will handle iterating through it under the hood. As I understand it, there are some optimizations used internally which make this faster than stepping through the sequence manually, one time step at a time.
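
For what it's worth, here is a minimal sketch (the sizes are made up) showing both ways of running the same nn.LSTM: one call over the full sequence, and a manual loop that feeds one time step at a time and carries the hidden state forward. Both use the same single set of weights and compute the same recurrence, which answers your questions 1) and 2):

import torch
import torch.nn as nn

# hypothetical sizes, just for illustration
batch_size, seq_len, input_size, hidden_size, num_layers = 4, 50, 8, 32, 1
lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)

x = torch.randn(batch_size, seq_len, input_size)

# Pass the whole sequence at once: the module iterates over the time steps internally,
# feeding each step's hidden state into the next step.
out_full, (h_n, c_n) = lstm(x)
print(out_full.shape)  # torch.Size([4, 50, 32]) -> one output per time step

# Equivalent manual loop, one time step at a time, carrying (h, c) forward ourselves.
h = torch.zeros(num_layers, batch_size, hidden_size)
c = torch.zeros(num_layers, batch_size, hidden_size)
outs = []
for t in range(seq_len):
    out_t, (h, c) = lstm(x[:, t:t+1, :], (h, c))
    outs.append(out_t)
out_loop = torch.cat(outs, dim=1)

# Same recurrence, same weights: results should match up to floating-point tolerance.
print(torch.allclose(out_full, out_loop, atol=1e-6))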

Regarding feeding the model timestamps, I assume you meant positional encoding? As far as I know, no study has shown any benefit (or lack thereof) from adding positional encoding to a recurrent neural network model (i.e. an LSTM); that has only been demonstrated to work in Transformers. It's believed that sequential information is learned and held within the hidden state, which is updated as each token is passed through.

OK, so it is the PyTorch LSTM module (not my forward implementation) that handles the iterating, I guess. So how can I force the PyTorch LSTM module to work as a stateful LSTM (i.e. using the previous step's hidden and cell state as the input for the current step, for each step in the sequence)?
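
(For concreteness, a minimal sketch of what I mean by stateful across calls, assuming a long series split into consecutive chunks of the same batch; the sizes are made up. Is passing the returned state back in, as below, the intended way?)

import torch
import torch.nn as nn

# hypothetical sizes, just for illustration
lstm = nn.LSTM(input_size=8, hidden_size=32, num_layers=1, batch_first=True)

state = None  # (h, c); None makes nn.LSTM start from zeros
chunks = torch.randn(5, 4, 10, 8)  # 5 consecutive chunks of the same 4 series, 10 steps each
for chunk in chunks:
    out, state = lstm(chunk, state)
    # detach so gradients do not flow back across chunk boundaries
    state = tuple(s.detach() for s in state)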