I want to keep training my model continuously as new data arrives. Since I can't know the future data in advance, how can I use StandardScaler so that it also covers the incoming data?
I can think of two possible solutions:
- Build a base model and fit the StandardScaler on its training data, then keep the scaler frozen as new data arrives. Refit the scaler and retrain the base model at a fixed interval.
- Build a base model and fit the StandardScaler on its training data, then update the scaler's mean and std incrementally with a running mean/std algorithm as new data arrives.
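To make the second option concrete, here is a minimal sketch of the kind of running mean/std update I have in mind, using Welford's algorithm. The `RunningScaler` class and its method names are hypothetical, written only for illustration; as far as I understand, scikit-learn's `StandardScaler.partial_fit` offers a built-in incremental update along the same lines.

```python
import math


class RunningScaler:
    """Hypothetical helper: per-feature running mean/std via Welford's update."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # running sum of squared deviations

    def update(self, row):
        """Fold one new sample (a sequence of floats) into the statistics."""
        self.n += 1
        for j, x in enumerate(row):
            delta = x - self.mean[j]
            self.mean[j] += delta / self.n
            self.m2[j] += delta * (x - self.mean[j])

    def std(self):
        # Population std (divide by n), the convention StandardScaler uses.
        return [math.sqrt(m2 / self.n) if self.n else 0.0 for m2 in self.m2]

    def transform(self, row):
        # A zero std is replaced by 1.0 so constant features do not divide by zero.
        return [(x - m) / (s or 1.0)
                for x, m, s in zip(row, self.mean, self.std())]


# Toy data stream with two features (made-up values).
scaler = RunningScaler(n_features=2)
for row in [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]:
    scaler.update(row)

print(scaler.mean, scaler.std())
print(scaler.transform([2.0, 20.0]))
```

With this approach the scaled value of an old sample changes as the statistics drift, so the model sees slightly different inputs over time for the same raw value. That drift is part of what I am unsure about.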
Which one is the right approach for an online StandardScaler? Or is there a better solution?
Thanks for your time.