I'm loading my trained model from a checkpoint to fine-tune it.
Then, when I run it, I see one of two behaviors:
- the output seems OK, and the loss is on the same scale as at the end of pre-training;
- the output is totally different, very bad, just like it's "training from scratch".
Some details:
- the model was pretrained with DDP
- the model has BN (batch norm) layers
Am I doing something wrong?
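For reference, here is a minimal sketch of what I understand the checkpoint round-trip should look like with a DDP-trained model that has BN layers. This is an illustrative toy, not my actual training code: the model shape and file name are made up, and the `.module` unwrapping is an assumption about how the DDP wrapper is handled.

```python
import torch
import torch.nn as nn

# Toy model with a BatchNorm layer, standing in for the pretrained net
# (shapes and names are illustrative, not the real model).
model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8))

# A few forward passes in train mode so BN accumulates running stats.
model.train()
for _ in range(5):
    model(torch.randn(16, 8))

# Saving: if the model is wrapped in DDP, save ddp_model.module.state_dict()
# so the keys are not prefixed with "module.". Note that state_dict()
# includes the BN buffers (running_mean / running_var), not only the
# learnable parameters.
torch.save({"model": model.state_dict()}, "ckpt.pt")

# Loading for fine-tuning: strict=True raises on any key mismatch
# (e.g. leftover "module." prefixes). A silent mismatch that leaves BN
# stats at their init values is one way fine-tuning can behave like
# training from scratch.
model2 = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8))
state = torch.load("ckpt.pt")["model"]
model2.load_state_dict(state, strict=True)

# The restored BN running stats should match the pretrained ones.
print(torch.allclose(model2[1].running_mean, model[1].running_mean))
print(torch.allclose(model2[1].running_var, model[1].running_var))
```

If the keys don't line up (or the BN buffers aren't restored), that could explain the "training from scratch" behavior.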