Predicting rare events and their strength with LSTM


I’m currently building an LSTM to predict rare events. I’ve seen this paper, which suggests a two-stage approach: first an LSTM autoencoder to extract features, and then a second LSTM that takes the embeddings and makes the actual prediction. According to the authors, the autoencoder finds patterns that the prediction layers can then exploit.

In my case, I need to predict whether an extreme event will occur (this is the most important part) and, if so, how strong it will be. Following their advice, I built the model, but instead of adding a single LSTM from embeddings to predictions I added two: one for the binary prediction (event / no event), ending in a sigmoid layer, and a second one for predicting the strength. So I have three losses: the reconstruction loss (MSE), the strength prediction loss (MSE), and the binary loss (binary cross-entropy).
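Conceptually, the combined objective I’m training is just a weighted sum of the three losses. Here is a pure-NumPy sketch of what I mean (the loss weights `w_recon`, `w_pred`, `w_event` are placeholders I made up, not values from the paper):

```python
import numpy as np

def mse(y_true, y_pred):
    # mean squared error for reconstruction and strength prediction
    return float(np.mean((y_true - y_pred) ** 2))

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    # binary cross-entropy on the sigmoid output of the event head
    p = np.clip(p_pred, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def total_loss(x, x_recon, y, y_pred, event, event_prob,
               w_recon=1.0, w_pred=1.0, w_event=1.0):
    # weighted sum of the three objectives; the weights are hypothetical knobs
    return (w_recon * mse(x, x_recon)
            + w_pred * mse(y, y_pred)
            + w_event * binary_cross_entropy(event, event_prob))
```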

The problem is that I’m not sure the model is learning anything: the binary loss stays at 0.5, and even the reconstruction loss is not very good. And of course, the time series is mostly zeros, with occasional values between 1 and 10, so MSE is probably not a good loss for it.
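One thing I’ve been considering for the zero-inflated series is weighting the loss so the rare nonzero targets aren’t drowned out by the zeros. A sketch of what I mean (the `pos_weight` value is arbitrary, just for illustration):

```python
import numpy as np

def weighted_mse(y_true, y_pred, pos_weight=10.0):
    # upweight the rare nonzero targets relative to the many zero days
    w = np.where(y_true > 0, pos_weight, 1.0)
    return float(np.sum(w * (y_true - y_pred) ** 2) / np.sum(w))
```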

What do you think about this approach?

  1. Is this a good architecture for predicting rare events? If not, which one would be better?
  2. Should I add a CNN or fully connected layers between the embeddings and the two LSTMs, to extract 1D patterns from the embedding, or feed the embeddings directly into the prediction heads?
  3. Should the predicting LSTM be just one, using only an MSE loss?
  4. Would it be a good idea to multiply the two predictions, to force the predicted no-event days to coincide in both outputs?
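To make question 4 concrete, what I have in mind is gating the strength output by the event output, either hard (threshold, then zero out) or soft (just multiply, so both heads get gradients). A NumPy sketch (the function names are my own):

```python
import numpy as np

def gated_magnitude(event_prob, magnitude, threshold=0.5):
    # hard gate: zero out the strength on days the classifier says "no event"
    return np.where(event_prob >= threshold, magnitude, 0.0)

def soft_gated_magnitude(event_prob, magnitude):
    # soft gate: multiply the two outputs, so no-event days are pushed to ~0
    return event_prob * magnitude
```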

