Updating model parameters after each batch during inference / testing

I am dealing with a time series forecasting problem. When the model is in production, we will likely update the model weights daily (either by training from scratch on the full dataset or by updating the model parameters based on the loss produced on the most recent observations). This is so that the model has always “seen” the most recent data.

During development, I wish to simulate the conditions the model would work under when in production. I.e. during model testing, I wish to update the model parameters after each mini-batch.

In pseudo-ish code I imagine I could do something like this:

def test_model(...):
    for i, batch in enumerate(testloader):
        inputs, targets = batch

        prediction = model(inputs)
        loss = criterion(prediction, targets)

        # record the loss as a test metric, then
        # update the parameters on this batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

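Concretely, the loop I have in mind would look something like this (a toy model and synthetic data stand in for my real setup; the model, optimizer, and loader here are just placeholders):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder model/optimizer/data for illustration only
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.2), nn.Linear(16, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Fake "test" stream: 5 batches of (input, target) pairs
testloader = [(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(5)]

model.eval()  # disable Dropout for the predictions themselves
for i, (inputs, targets) in enumerate(testloader):
    # eval() mode still tracks gradients; only torch.no_grad() would disable them
    prediction = model(inputs)
    loss = criterion(prediction, targets)

    # record the loss *before* updating, so the metric reflects
    # predictive ability on data the model has not yet trained on
    print(f"batch {i}: test loss {loss.item():.4f}")

    # now update the parameters on this batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that `eval()` only changes layer behavior (Dropout, BatchNorm); it does not stop gradient tracking, so the backward pass still works here.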
I’m not quite sure if this would produce reliable results (i.e. results that actually express the model’s predictive ability).

I’m experimenting with several different architectures. They all use nn.Dropout and some of them use LayerNorm.

The LayerNorm documentation says that the “layer uses statistics computed from input data in both training and evaluation modes” so I would assume setting the model.eval() mode is not strictly necessary to make the LayerNorm layer work as intended.

But I suppose the Dropout layers do require model.eval() during testing / inference.
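A quick check with toy tensors seems to confirm both points: Dropout is an identity in eval mode, while LayerNorm produces identical outputs in both modes because it always normalizes with statistics computed from the current input:

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(3, 10)

drop = nn.Dropout(p=0.5)
norm = nn.LayerNorm(10)

# Dropout: active in train mode, identity in eval mode
drop.train()
assert not torch.equal(drop(x), x)  # elements zeroed, survivors rescaled by 1/(1-p)
drop.eval()
assert torch.equal(drop(x), x)      # eval mode: input passes through unchanged

# LayerNorm: uses per-sample input statistics in BOTH modes,
# so train()/eval() makes no difference to its output
norm.train()
out_train = norm(x)
norm.eval()
out_eval = norm(x)
assert torch.allclose(out_train, out_eval)
```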


  1. Does it even make sense to update model parameters after each batch during inference and testing? Or are there some pitfalls I am not aware of?
  2. If the answer to Q1 is “yes”, does the approach described above (and indicated in the code) make sense? Or would it be better to update model weights based on test instances in a separate function where predictions are not made in model.eval() mode so that Dropout is enabled during prediction?
  3. For the approach laid out in the code, is model.eval() necessary?

Thanks in advance

  1. It could make sense assuming you have a valid target for these samples, which is usually not the case. If you already have the corresponding targets for each sample during deployment, you might not need to use a neural network at all as it would only try to predict the same target.

  2. You would need to zero out the gradients and I don’t think it’s necessary to call model.train() before the loss calculation and backward pass.

  3. Yes, I would assume your model is in eval() mode the entire time during deployment.
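To illustrate the point about zeroing gradients: `backward()` accumulates into `.grad`, so without `zero_grad()` each batch's update would mix in gradients from earlier batches (toy example, names are placeholders):

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(4, 1)
x, y = torch.randn(2, 4), torch.randn(2, 1)
criterion = nn.MSELoss()

loss = criterion(layer(x), y)
loss.backward()
g1 = layer.weight.grad.clone()

# second backward WITHOUT zeroing: gradients add up
loss = criterion(layer(x), y)
loss.backward()
assert torch.allclose(layer.weight.grad, 2 * g1)

# zeroing restores a clean slate before the next batch
layer.zero_grad()
loss = criterion(layer(x), y)
loss.backward()
assert torch.allclose(layer.weight.grad, g1)
```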


Just to clarify: My questions solely regard how to go about training and testing the model before deploying to production.

In production, we would not receive the actual target values until a few hours after having produced the forecasts. So in production, I would not perform the inference and then update the model immediately. But for the purpose of training and evaluating the model prior to deployment, I thought I might as well update model parameters straight away - IF the approach laid out above makes sense 🙂

Ah OK, thanks for clarifying the use case as I misunderstood it.
I think it makes sense to fine-tune the already deployed model once you have collected new training data, but I would most likely treat it as a “standard fine-tuning” approach. I.e. call model.train(), fine-tune it with the new samples, call model.eval() and deploy it again.

@tom is working on TorchDrift which detects “drifts” in your already deployed model and could warn you when the accuracy decreases, fine-tuning would be needed etc. and I think it could be an interesting tool for your use case.
