I am dealing with a time series forecasting problem. When the model is in production, we will likely update the model weights daily (either by training from scratch on the full dataset or by updating the model parameters based on the loss produced on the most recent observations). This is so that the model has always “seen” the most recent data.
During development, I wish to simulate the conditions the model would work under in production; i.e., during model testing I want to update the model parameters after each mini-batch.
In pseudo-ish code I imagine I could do something like this:
```python
def test_model(model, testloader, criterion, optimizer):
    for i, batch in enumerate(testloader):
        input, target = batch

        # Predict with the weights as they are *before* seeing this batch
        model.eval()
        prediction = model(input)

        # Then update the weights using this batch's loss
        model.train()
        loss = criterion(prediction, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```
I’m not quite sure whether this would produce reliable results (i.e. results that actually reflect the model’s predictive ability).
I’m experimenting with several different architectures. They all use nn.Dropout and some of them use LayerNorm.
The LayerNorm documentation says that the “layer uses statistics computed from input data in both training and evaluation modes”, so I would assume that calling model.eval() is not strictly necessary for the LayerNorm layers to behave as intended.
But I suppose the Dropout layers do require model.eval() during testing / inference.
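To convince myself of this, I put together a small check (just a sketch; the layer size and input are arbitrary) comparing how nn.LayerNorm and nn.Dropout behave in train vs. eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 16)  # arbitrary batch of inputs

# LayerNorm: output should be identical in train and eval mode,
# since each sample is normalized with its own statistics
ln = nn.LayerNorm(16)
ln.train()
out_train = ln(x)
ln.eval()
out_eval = ln(x)
print(torch.allclose(out_train, out_eval))  # expect True

# Dropout: zeroes/rescales activations in train mode, is the identity in eval mode
drop = nn.Dropout(p=0.5)
drop.train()
print(torch.equal(drop(x), x))  # expect False (elements dropped and rescaled)
drop.eval()
print(torch.equal(drop(x), x))  # expect True (no-op)
```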
- Does it even make sense to update model parameters after each batch during inference and testing? Or are there some pitfalls I am not aware of?
- If the answer to Q1 is “yes”, does the approach described above (and indicated in the code) make sense? Or would it be better to update model weights based on test instances in a separate function where predictions are not made in model.eval() mode, so that Dropout stays enabled during prediction (see the sketch after this list)?
- For the approach laid out in the code, is model.eval() necessary?
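For reference, the alternative I have in mind for Q2 would look roughly like this (just a sketch; the function and argument names are placeholders):

```python
def test_model_train_mode(model, testloader, criterion, optimizer):
    # Alternative: keep the model in train mode throughout,
    # so Dropout is also active while the predictions are made
    model.train()
    for i, batch in enumerate(testloader):
        input, target = batch

        prediction = model(input)            # prediction with Dropout enabled
        loss = criterion(prediction, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                     # update weights on this test batch
```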
Thanks in advance