I’d like to force a single weight to be the same in two different linear layers and have them optimise jointly.
I’ve tried the following, but now I can’t optimise since the ‘hidden_layer’ is no longer a leaf node.
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, dim_in, dim_out, dim_hidden):
        super(Net, self).__init__()
        self.input_layer = nn.Linear(dim_in, dim_hidden)
        self.hidden_layer = nn.Linear(dim_hidden, dim_hidden)
        self.hidden_layer.weight[0,0] = self.input_layer.weight[0,0]  # <- this is my attempt at weight-sharing
        self.out_layer = nn.Linear(dim_hidden, dim_out)
Is there any way to overcome this? I’d like to build up clusters of shared weights, if that is possible.
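For concreteness, here is a minimal sketch of one direction that might work, not a definitive solution: keep the tied entry as its own scalar parameter and assemble the effective weight matrices in the forward pass with `torch.where`, so every registered parameter stays a leaf. The `shared` parameter, the `mask_in` / `mask_hid` buffers, the `forward` method, and the ReLU activations are all assumptions added for illustration; none of them are in the attempt above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self, dim_in, dim_out, dim_hidden):
        super().__init__()
        self.input_layer = nn.Linear(dim_in, dim_hidden)
        self.hidden_layer = nn.Linear(dim_hidden, dim_hidden)
        self.out_layer = nn.Linear(dim_hidden, dim_out)

        # Hypothetical: one scalar leaf parameter standing in for the tied entry [0, 0].
        self.shared = nn.Parameter(self.input_layer.weight[0, 0].detach().clone())

        # Boolean masks marking which entry of each weight matrix is tied.
        mask_in = torch.zeros(dim_hidden, dim_in, dtype=torch.bool)
        mask_in[0, 0] = True
        mask_hid = torch.zeros(dim_hidden, dim_hidden, dtype=torch.bool)
        mask_hid[0, 0] = True
        self.register_buffer("mask_in", mask_in)
        self.register_buffer("mask_hid", mask_hid)

    def forward(self, x):
        # Build the effective weights on every call: masked entries come from
        # self.shared, all other entries from the layer's own weight.
        w_in = torch.where(self.mask_in, self.shared, self.input_layer.weight)
        w_hid = torch.where(self.mask_hid, self.shared, self.hidden_layer.weight)
        # ReLU is an assumed activation; the original post does not show forward().
        h = torch.relu(F.linear(x, w_in, self.input_layer.bias))
        h = torch.relu(F.linear(h, w_hid, self.hidden_layer.bias))
        return self.out_layer(h)
```

With this layout, gradients from both layers accumulate into `shared`, while the `[0, 0]` entries stored inside the two `nn.Linear` weights are unused and receive zero gradient. Clusters of shared weights could presumably be handled the same way, with one parameter per cluster and a larger mask (or an index tensor) per layer, though I have not checked how well that scales.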