Train loss unexpectedly increases after epoch 2

Hi, I want to fine-tune the vit-tiny/16 model on imagenet-1k with the fp16 datatype. However, the train loss unexpectedly increases after epoch 2. What might be causing this?
[Image: train loss curve]