Hi,
I have a network with two branches, one per modality. I pre-trained each branch individually, and now I want to train the full network with its weights initialized from the pre-trained models.
Will it work if I just do:
two_branch_net.load_state_dict(torch.load(first_branch_net), strict=False)
two_branch_net.load_state_dict(torch.load(second_branch_net), strict=False)
Or will the second load “override” the first one somehow? (Assume the layer names in the two branches are different and unique.)
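
To make the question concrete, here is a minimal sketch of the setup. The names `TwoBranchNet`, `audio_branch`, `video_branch` and `head` are hypothetical, and the hand-built state dicts only stand in for my real checkpoints. It assumes the checkpoint keys already carry the same prefixes as the corresponding layers in the full network. As far as I understand, `strict=False` skips non-matching keys silently, so the sketch also keeps the value returned by `load_state_dict` to inspect `missing_keys` and `unexpected_keys`.

```python
import torch
import torch.nn as nn


class TwoBranchNet(nn.Module):
    """Hypothetical two-branch model; layer names are illustrative only."""

    def __init__(self):
        super().__init__()
        self.audio_branch = nn.Linear(4, 2)  # first modality
        self.video_branch = nn.Linear(6, 2)  # second modality
        self.head = nn.Linear(4, 1)          # fusion layer, not pre-trained

    def forward(self, a, v):
        fused = torch.cat([self.audio_branch(a), self.video_branch(v)], dim=1)
        return self.head(fused)


# Stand-ins for the two pre-trained checkpoints (assumption: keys are
# already prefixed with the branch names used in the full network).
first_state = {
    "audio_branch.weight": torch.ones(2, 4),
    "audio_branch.bias": torch.zeros(2),
}
second_state = {
    "video_branch.weight": torch.full((2, 6), 2.0),
    "video_branch.bias": torch.zeros(2),
}

net = TwoBranchNet()
r1 = net.load_state_dict(first_state, strict=False)
r2 = net.load_state_dict(second_state, strict=False)

# Check whether the first branch still holds its loaded weights after
# the second call, and whether the second branch was loaded as well.
first_kept = torch.equal(net.audio_branch.weight, torch.ones(2, 4))
second_loaded = torch.equal(net.video_branch.weight, torch.full((2, 6), 2.0))

# Keys of the full network that the second checkpoint did not provide.
print(r2.missing_keys)
# Keys in each checkpoint that matched nothing in the full network.
print(r1.unexpected_keys, r2.unexpected_keys)
```

If the second call left the first branch alone, `first_kept` should be true. A non-empty `unexpected_keys` would suggest that the checkpoint key names do not line up with the full network and that nothing was loaded for those keys.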
Thanks