I fine-tuned ResNet50 and got weird outputs

I inserted a 128-dim fc layer (activated by sigmoid) before the last fc layer, changed the output layer to 10 dimensions, froze all the earlier layers, and trained only the last two layers on my dataset (about 5000 images from ImageNet). Before the first round of training I checked the outputs of the layer right before the 128-dim layer and they looked normal, but after 10 training rounds they were all negative; the loss increased and didn't decrease over the next 10 rounds. I don't know why.

Hard to say; maybe the learning rate is too large?

It turns out my inputs were wrong.