Consider a scenario where I want to fine-tune a trained model for a downstream task.
The model was trained with [0, 1] normalization. Is it okay if I switch to z-score normalization [ input = (input - mean) / std ] while fine-tuning?
I am not sure how to make sense of this scenario. Please help.
I would guess your initial loss might be a bit higher with this new normalization approach, as your model would see "new" samples from its perspective. Fine-tuning the model might still work, but again I would guess you might get better results if you fine-tune the entire model (not only the last layer(s)), since the early layers also need to adapt to the shifted input distribution. These are just my guesses, so I would recommend running some experiments and comparing both approaches (and please share the results if you plan on running these tests!).
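To make the distribution shift concrete, here is a small sketch comparing the two schemes on the same synthetic image. The array shape, value range, and per-channel statistics are assumptions for illustration; in practice the z-score mean/std would come from your fine-tuning dataset (or the original training set).

```python
import numpy as np

# Hypothetical raw 8-bit image, channels-first (3, H, W) — an assumption for the demo.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(3, 32, 32)).astype(np.float32)

# [0, 1] normalization, as used during the original training.
min_max = raw / 255.0

# z-score normalization; per-channel mean/std computed from this sample
# only for illustration — normally precomputed over the whole dataset.
mean = min_max.mean(axis=(1, 2), keepdims=True)
std = min_max.std(axis=(1, 2), keepdims=True)
z_score = (min_max - mean) / std

# The same pixels land on different scales, so the first layer of the
# pretrained model sees inputs far outside the range it was trained on.
print("[0,1]  range:", min_max.min(), min_max.max())
print("z-score range:", z_score.min(), z_score.max())
```

Roughly half of the z-scored values are negative, whereas the model only ever saw non-negative inputs during training, which is why the initial fine-tuning loss is likely to jump.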