Loading state_dict from two pre-trained models

iariav · July 1, 2018, 11:42am

Hi,
I have a network with two branches, one per modality. I pre-trained each branch individually, and now I want to train the full network, initializing its weights from my two pre-trained models.
Will it work if I just do:

two_branch_net.load_state_dict(torch.load(first_branch_net), strict=False)
two_branch_net.load_state_dict(torch.load(second_branch_net), strict=False)

Or will the second load “override” the first one somehow? (Assume the layer names in the two branches are different and unique.)

thanks

Shani_Gamrian · July 1, 2018, 2:30pm

If the layer names are unique, it should work. You can check this by initializing all the weights to zero at the beginning and then inspecting them after each load_state_dict call.
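
The zero-initialization check could look roughly like the sketch below. Everything in it is made up for illustration: the TwoBranchNet class, the branch_a / branch_b names, and the two in-memory state dicts that stand in for the files you would get from torch.load. It also assumes a reasonably recent PyTorch, where load_state_dict returns an object with missing_keys and unexpected_keys.

```python
import torch
import torch.nn as nn


class TwoBranchNet(nn.Module):
    """Hypothetical two-branch model; the branch names are unique."""

    def __init__(self):
        super().__init__()
        self.branch_a = nn.Linear(4, 3)
        self.branch_b = nn.Linear(5, 2)


# Stand-ins for the two pre-trained checkpoints (assumption: each one
# contains only the keys of its own branch, under the same names).
torch.manual_seed(0)
pretrained = TwoBranchNet()
full_state = pretrained.state_dict()
state_a = {k: v.clone() for k, v in full_state.items() if k.startswith("branch_a.")}
state_b = {k: v.clone() for k, v in full_state.items() if k.startswith("branch_b.")}

# Zero every parameter so that loaded values are easy to spot.
net = TwoBranchNet()
with torch.no_grad():
    for p in net.parameters():
        p.zero_()

# First load: only branch_a should change.
result_a = net.load_state_dict(state_a, strict=False)
a_loaded = torch.equal(net.branch_a.weight, state_a["branch_a.weight"])
b_still_zero = bool((net.branch_b.weight == 0).all())

# Second load: branch_b is filled in, branch_a keeps its loaded values.
result_b = net.load_state_dict(state_b, strict=False)
a_kept = torch.equal(net.branch_a.weight, state_a["branch_a.weight"])
b_loaded = torch.equal(net.branch_b.weight, state_b["branch_b.weight"])

print(a_loaded, b_still_zero, a_kept, b_loaded)
print("missing after first load:", result_a.missing_keys)
print("unexpected in first load:", result_a.unexpected_keys)
```

One thing to watch for: with strict=False, keys that do not match any name in the model are skipped without an error, so if a checkpoint was saved with a different prefix, nothing from it gets loaded. Checking unexpected_keys (or the zero test above) catches that case.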