LSTM h_0 & c_0 storage when using one-by-one forward steps

Hello everyone,
I would like to create a real-time anomaly checker for a sensor. To do this, I initialise a self.h_0 & self.c_0 in __init__ and with each forward step I save the state in it (very similar to that: But now I have the problem that at the beginning of each new time series the old values of the previous series are stored in self.h_0 & self.c_0. For this reason, I always get different results when I run through the same data several times. Is there a good solution for this Problem?

I can add an if init_hidden: branch to def forward(self, input, init_hidden=False): and set the flag at the beginning of each new time series. However, then I can no longer torch.jit.trace() the network.

torch.jit.trace does not support data-dependent control flow or other conditions, while torch.jit.script does. However, note that TorchScript in general is in maintenance mode and the current recommendation is to use torch.compile instead.
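As a sketch of that distinction (module and flag names follow the question, everything else is illustrative): torch.jit.script compiles the if init_hidden: branch itself, so the reset flag keeps working in the compiled module, whereas trace would only record whichever path the example input happened to take:

```python
import torch
import torch.nn as nn

class ResettableLSTM(nn.Module):
    """Illustrative stateful LSTM with the init_hidden reset flag."""
    def __init__(self, input_size=1, hidden_size=8):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.h_0 = torch.zeros(1, 1, hidden_size)
        self.c_0 = torch.zeros(1, 1, hidden_size)

    def forward(self, x, init_hidden: bool = False):
        if init_hidden:  # data-dependent branch: trace would bake in one path
            self.h_0 = torch.zeros(1, 1, self.hidden_size)
            self.c_0 = torch.zeros(1, 1, self.hidden_size)
        out, (h_n, c_n) = self.lstm(x, (self.h_0, self.c_0))
        self.h_0, self.c_0 = h_n.detach(), c_n.detach()
        return out

torch.manual_seed(0)
scripted = torch.jit.script(ResettableLSTM())  # script compiles the branch
x = torch.randn(1, 1, 1)
first = scripted(x, init_hidden=True)   # reset state, then step
_ = scripted(x)                         # state carries over within a series
second = scripted(x, init_hidden=True)  # reset again -> reproducible result
```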

I have now solved it by returning h_0 and c_0 as outputs and feeding them back in as inputs. I only have to do this when feeding it live data; for training I can read the data in the normal way, as a time-series block with many points in time. I don’t think this is the best solution, but it gives me the most freedom over h_0 & c_0.
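A minimal sketch of that input/output-state design (names are illustrative): the module itself is stateless, the caller owns h and c and passes zeros at the start of each new time series, and since forward has no control flow, torch.jit.trace works:

```python
import torch
import torch.nn as nn

class StatelessLSTM(nn.Module):
    """Sketch: hidden state goes in and out instead of living on the module."""
    def __init__(self, input_size=1, hidden_size=8):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, x, h_0, c_0):
        out, (h_n, c_n) = self.lstm(x, (h_0, c_0))
        return out, h_n, c_n  # caller feeds h_n/c_n back in on the next step

torch.manual_seed(0)
model = StatelessLSTM()
# Start of a new time series: the caller simply passes zero states.
h = torch.zeros(1, 1, 8)
c = torch.zeros(1, 1, 8)

# Live inference, one reading at a time; state handling stays outside.
x = torch.randn(1, 1, 1)
out, h, c = model(x, h, c)

# No data-dependent control flow in forward, so tracing succeeds.
traced = torch.jit.trace(model, (x, h, c))
```

For training, the same module can consume a whole sequence at once by passing a longer x and a single initial zero state.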

I want to use the net later on iOS devices. So do I have to rely on torch.jit.trace or torch.jit.script, or is it also possible with torch.compile?

I don’t know if torch.compile is supported on iOS devices, but I would guess ExecuTorch would be the right approach for deploying to these devices. CC @marksaroufim as he would know more about the support.

Oh nice, I have now used CoreMLTools to get it onto the device. But I think it’s worth having a look at ExecuTorch.
Looks really interesting, I can see the cross-platform benefit. But is there an advantage in performance, especially compared to Apple’s own CoreML?
Do benchmarks already exist?