I have been struggling with how to share the same weights between two models for a long time.
For example, I have two RNN (or RNNCell) models, RNN1 and RNN2, and I want to share the weight_hh of these two models, meaning that RNN1.weight_hh_l0 is the same tensor as RNN2.weight_hh_l0. The two models receive different inputs, and I want the shared weight_hh_l0 to be updated by training both RNNs at the same time.
How can I do this? Could anyone help?
import torch
import torch.nn as nn

# Shared hidden-to-hidden weight; torch.randn gives it an initial value
# (torch.Tensor(4, 4) would leave it uninitialized)
w = nn.Parameter(torch.randn(4, 4))
rnn1 = nn.RNN(3, 4)
rnn2 = nn.RNN(3, 4)
rnn1.weight_hh_l0 = w  # nn.Module registers the assigned Parameter
rnn2.weight_hh_l0 = w  # both modules now point at the same tensor
I guess this could also work. One thing I am not sure about is whether the autograd for w involves both rnn1 and rnn2. Does dw = drnn1(w) + drnn2(w) hold true when I take the gradient of w with PyTorch autograd?
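One way to check this empirically is to compare the gradient of each RNN's loss with respect to w against the accumulated w.grad after a joint backward pass. The sketch below uses arbitrary shapes and random inputs just for illustration; since both modules reference the same Parameter, autograd should accumulate (sum) the contributions from both graphs:

```python
import torch
import torch.nn as nn

# Share one hidden-to-hidden weight between two RNNs
w = nn.Parameter(torch.randn(4, 4))
rnn1 = nn.RNN(3, 4)
rnn2 = nn.RNN(3, 4)
rnn1.weight_hh_l0 = w
rnn2.weight_hh_l0 = w

# Different inputs for the two models: (seq_len, batch, input_size)
x1 = torch.randn(5, 1, 3)
x2 = torch.randn(5, 1, 3)

out1, _ = rnn1(x1)
out2, _ = rnn2(x2)

# Gradient of each loss w.r.t. w, computed separately
g1 = torch.autograd.grad(out1.sum(), w, retain_graph=True)[0]
g2 = torch.autograd.grad(out2.sum(), w, retain_graph=True)[0]

# Joint backward: w.grad should be the sum of both contributions
(out1.sum() + out2.sum()).backward()
print(torch.allclose(w.grad, g1 + g2))  # expected: True
```

If the check passes, a single optimizer step on w trains it with respect to both models at once, which is exactly the shared-weight behavior described above.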