Friends, help me! Recently I was running open-source model code from a paper, and after some structural changes I switched the weight initialization from the Xavier method to the Kaiming method. The final result improved by 3 points compared with Xavier init. Is this possible? Or could there be a bug somewhere that makes the final result wrong?
Initial values are just that: initial.
Those methods were proposed to prevent gradient explosion/vanishing at the start of training.
I don't think there's a critical difference between them.
No problem:)
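For intuition, the two schemes differ only in the scale of the initial weights. A minimal NumPy sketch (the fan sizes below are made-up examples, not from your model) comparing the standard deviations each method would use for one linear layer:

```python
import numpy as np

def xavier_std(fan_in, fan_out):
    # Glorot/Xavier: balances forward and backward variance,
    # derived for symmetric activations like tanh
    return np.sqrt(2.0 / (fan_in + fan_out))

def kaiming_std(fan_in):
    # He/Kaiming: compensates for ReLU zeroing half the activations
    return np.sqrt(2.0 / fan_in)

fan_in, fan_out = 512, 512  # hypothetical layer sizes
print(f"xavier std:  {xavier_std(fan_in, fan_out):.4f}")   # ~0.0442
print(f"kaiming std: {kaiming_std(fan_in):.4f}")           # ~0.0625
```

So for a ReLU network, Kaiming starts with somewhat larger weights; both aim at the same goal of keeping activation variance roughly constant across layers.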
But then why is the final result so different, when nothing else in the model structure changed?
- Train for more epochs
- Run multiple trainings to validate that the result is stable
- Some tasks do show different results depending on the initializer

These would help.
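The second suggestion is the key one: a 3-point gap from a single run can easily be seed noise. A minimal sketch of a multi-seed check, where `train_and_eval` is a hypothetical placeholder for your actual training run:

```python
import random
import statistics

def train_and_eval(seed):
    # Placeholder: replace with your real training + evaluation.
    # Here we just simulate run-to-run variation around a fixed score.
    random.seed(seed)
    return 80.0 + random.gauss(0, 1)  # hypothetical accuracy

scores = [train_and_eval(seed) for seed in range(5)]
print(f"mean={statistics.mean(scores):.2f}  std={statistics.stdev(scores):.2f}")
```

If the mean gap between the two initializers is well outside the per-init standard deviation over several seeds, the improvement is likely real rather than luck.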
Thanks a lot! I will try these methods.