Let’s say I have a simple network:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 5, bias=False)
        self.fc2 = nn.Linear(5, 2, bias=False)

    def forward(self, x):
        x = x.view(-1, 10)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = Net()
With my code I obtain a state_dict whose tensors have different shapes from the ones Net expects, for example:
fc1.weight
tensor([[ 0.1149,  0.2626, -0.0651,  0.0000, -0.0510,  0.0335,  0.2863,  0.0000,
         -0.1991,  0.0000],
        [ 0.0166, -0.1621,  0.0535, -0.2953, -0.2285, -0.1630,  0.1995,  0.1854,
         -0.1402, -0.0114]])
fc2.weight
tensor([[ 0.1775,  0.2999],
        [-0.3467, -0.2310]])
This state_dict cannot be loaded into model because of the dimension mismatch, and I cannot hard-code new sizes into the Net class, because different executions may produce state_dicts with different shapes.
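For reference, the failure looks roughly like this (sd is just my name for the state_dict above):

try:
    model.load_state_dict(sd)
except RuntimeError as e:
    print(e)
# RuntimeError: Error(s) in loading state_dict for Net:
#   size mismatch for fc1.weight: copying a param with shape
#   torch.Size([2, 10]) from checkpoint, the shape in current model
#   is torch.Size([5, 10]) (and similarly for fc2.weight)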
Is there a way to modify the architecture of the network, while preserving its forward logic and overall structure (the type and order of layers), so that I can load this state_dict without manually changing the class?
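In case it clarifies what I mean, the kind of solution I'm imagining is a sketch like the one below, which rebuilds each nn.Linear from the shapes stored in the state_dict before loading it. The helper name resize_linears_to_state_dict is hypothetical, and the sketch assumes a flat module hierarchy like Net's, not nested submodules:

def resize_linears_to_state_dict(model, sd):
    # First pass: work out the new layer sizes from the state_dict,
    # so we never mutate the module dict while iterating over it.
    replacements = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            out_features, in_features = sd[name + '.weight'].shape
            replacements[name] = nn.Linear(in_features, out_features,
                                           bias=module.bias is not None)
    # Second pass: swap in the resized layers (flat hierarchy assumed).
    for name, new_layer in replacements.items():
        setattr(model, name, new_layer)
    model.load_state_dict(sd)
    return model

model = resize_linears_to_state_dict(Net(), sd)

This keeps the type and order of the layers and leaves forward untouched, but I don't know whether it is a sound approach or whether PyTorch offers a cleaner mechanism for it.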
Thank you in advance.