Let’s say I have a simple network:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 5, bias=False)
        self.fc2 = nn.Linear(5, 2, bias=False)

    def forward(self, x):
        x = x.view(-1, 10)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = Net()
With my code I obtain a state_dict whose tensors have different dimensions, for example:
fc1.weight tensor([[ 0.1149,  0.2626, -0.0651,  0.0000, -0.0510,  0.0335,  0.2863,  0.0000, -0.1991,  0.0000],
                   [ 0.0166, -0.1621,  0.0535, -0.2953, -0.2285, -0.1630,  0.1995,  0.1854, -0.1402, -0.0114]])
fc2.weight tensor([[ 0.1775,  0.2999],
                   [-0.3467, -0.2310]])
This state_dict cannot be loaded into model because of the dimension mismatch. I cannot manually change the Net class, because different executions may lead to different dimensions.
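To make the mismatch concrete, here is roughly what happens when I try to load it (pruned_state_dict is just a stand-in name for the state_dict shown above, rebuilt here with random values of the same shapes; the error text is approximately what PyTorch reports):

import torch

# Stand-in for the pruned state_dict above: fc1.weight is [2, 10]
# instead of [5, 10], and fc2.weight is [2, 2] instead of [2, 5].
pruned_state_dict = {
    "fc1.weight": torch.randn(2, 10),
    "fc2.weight": torch.randn(2, 2),
}

model = Net()
model.load_state_dict(pruned_state_dict)
# RuntimeError: Error(s) in loading state_dict for Net:
#   size mismatch for fc1.weight: copying a param with shape
#   torch.Size([2, 10]) from checkpoint, the shape in current model
#   is torch.Size([5, 10]).
#   ...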
Is there a way to modify the architecture of the network, while preserving its forward logic and overall structure (type and order of layers), so that I can load this state_dict without manually changing the class?
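For what it is worth, one direction I was considering is to rebuild each layer from the shapes stored in the state_dict before loading it. Below is a minimal sketch, assuming every layer is a bias-free nn.Linear kept as a direct child of the model; resize_linears is a hypothetical helper I made up, not an existing PyTorch function:

import torch.nn as nn

def resize_linears(model, state_dict):
    # For every bias-free nn.Linear that is a direct child of `model`,
    # rebuild it with the (out_features, in_features) found in the
    # corresponding state_dict weight, preserving layer type and order.
    for name, module in list(model.named_children()):
        key = name + ".weight"
        if isinstance(module, nn.Linear) and key in state_dict:
            out_features, in_features = state_dict[key].shape
            setattr(model, name, nn.Linear(in_features, out_features, bias=False))
    model.load_state_dict(state_dict)
    return model

model = resize_linears(Net(), pruned_state_dict)

This keeps forward untouched, so the hard-coded x.view(-1, 10) still has to match fc1's in_features (it does in my example, where fc1.weight stays [2, 10]). Is something along these lines reasonable, or is there a more standard way?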
Thank you in advance.