LightGBM port to LSTM network

Background

In a binary classification problem on a dataset consisting of sequences of events, performance varies considerably across model types:

  • trained directly on the sequences of events, an LSTM gives an average precision score (APS) of 0.60
  • trained on features engineered from summary statistics (first and last occurrence, overall frequency, etc.), LightGBM gives an APS of 0.80
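To make the second setup concrete, here is a minimal sketch of what such summary-stat features might look like for a single sequence. It is an illustration under assumptions, not the exact pipeline used: the function name `summary_features`, the `vocab` argument, the `-1.0` placeholder for absent tokens, and the normalization by sequence length are all hypothetical choices.

```python
from collections import Counter


def summary_features(sequence, vocab):
    """Turn one event sequence into a fixed-length feature dict.

    For every token in `vocab` this records its overall frequency and the
    relative position of its first and last occurrence (-1.0 if absent).
    Names and encoding choices here are illustrative assumptions.
    """
    n = len(sequence)
    counts = Counter(sequence)

    # Record the first and last index at which each token appears.
    first, last = {}, {}
    for i, tok in enumerate(sequence):
        first.setdefault(tok, i)
        last[tok] = i

    feats = {"length": n}
    for tok in vocab:
        feats[f"{tok}_freq"] = counts[tok] / n if n else 0.0
        feats[f"{tok}_first"] = first[tok] / n if tok in first else -1.0
        feats[f"{tok}_last"] = last[tok] / n if tok in last else -1.0
    return feats


# Toy sequence of event tokens; "d" never occurs.
example = summary_features(["a", "b", "a", "c"], vocab=["a", "b", "c", "d"])
```

With fewer than 100 unique tokens, this yields a few hundred columns per sequence, which is a comfortable size for a gradient-boosted tree model.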

Dataset specs

  • The dataset is highly imbalanced (ratio = 0.1).
  • The number of unique tokens is < 100.

Questions

  1. Is there an obvious reason for this performance gap that I am missing?
  2. More importantly, what would be a good way to combine the LSTM and LightGBM models?