Problems in model init method

Friends, help me! I was recently running the open-source model code from a paper. After some structural changes, I switched the initialization from Xavier to Kaiming, and the final result of the model improved by 3 points compared with Xavier init. Is this possible? Or is it more likely that some bug is making the final result wrong?

The initial values are just a starting point.
These methods were proposed to prevent exploding or vanishing gradients and similar training problems.
I don't think there is a critical difference between them.
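For reference, the main thing the two schemes change is the scale of the initial weights. Here is a minimal sketch of the standard deviations they use, assuming the normal-distribution variants, fan-in mode for Kaiming, and a ReLU gain of sqrt(2). The function names are my own for illustration and are not taken from the model code in question.

```python
import math

def xavier_normal_std(fan_in, fan_out, gain=1.0):
    # Xavier/Glorot normal: std = gain * sqrt(2 / (fan_in + fan_out))
    return gain * math.sqrt(2.0 / (fan_in + fan_out))

def kaiming_normal_std(fan_in, gain=math.sqrt(2.0)):
    # Kaiming/He normal in fan-in mode: std = gain / sqrt(fan_in)
    # (gain = sqrt(2) is the usual choice for ReLU; an assumption here)
    return gain / math.sqrt(fan_in)

# Hypothetical layer size, chosen only for illustration.
fan_in, fan_out = 512, 512
print("xavier std :", xavier_normal_std(fan_in, fan_out))
print("kaiming std:", kaiming_normal_std(fan_in))
```

For a square layer with these assumptions, the Kaiming weights start out larger by a factor of sqrt(2), so the two runs begin from differently scaled weights even though the architecture is the same.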

So I don't see a problem :)

But why is the final result so different when I didn't change anything else in the model?

  • Train for more epochs.
  • Run the training several times to check that the result is stable.
  • On some tasks, the results do depend on the initializer.

That would help.
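The "run it several times" check could be sketched like this. It assumes you wrap your own training in a function `train_fn(init_name, seed)` that returns the final metric; that function, the initializer names, and the seed list are placeholders, not part of any library.

```python
import statistics

def run_seeds(train_fn, init_name, seeds):
    """Call train_fn(init_name, seed) once per seed and collect the final metrics."""
    return [train_fn(init_name, seed) for seed in seeds]

def summarize(scores):
    """Return (mean, sample standard deviation) of a list of final metrics."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std

def compare(train_fn, seeds=(0, 1, 2, 3, 4)):
    """Summarize both initializers over the same set of seeds."""
    return {name: summarize(run_seeds(train_fn, name, seeds))
            for name in ("xavier", "kaiming")}

# Usage (train_fn is your own training routine, hypothetical here):
# results = compare(train_fn)
# for name, (mean, std) in results.items():
#     print(f"{name}: {mean:.2f} +/- {std:.2f}")
```

If the 3-point gap is clearly larger than the seed-to-seed spread for both initializers, it is more likely a real effect of the initializer than noise from a single run.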

Thanks a lot! I will try these methods.