So, you are calling __init__ only once and then applying forward to the same input multiple times, without doing any optimization or changing the weights in any way, right?
Edit: I don’t know if this has something to do with your issue, but the class you declared has a problem: the methods of the class should take self as their first argument. Moreover, you usually want to subclass nn.Module and call super in the constructor.
Yes, I’ve done all that (the usual template). I omitted it in the code above so it’s easier to see my intention: a multi-output linear layer at the end of the network that gives different results.
Have you checked manually whether the weights are still the same after loading the model? At this point they should be different, since you are getting different results, though I don’t know why they would have changed.
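A minimal sketch of such a manual check (file name `lin.pt` is just an example): compare a layer’s weights against a freshly constructed copy before and after loading the saved state_dict.

```python
import torch

# Original layer; save its parameters.
lin = torch.nn.Linear(50, 2)
torch.save(lin.state_dict(), 'lin.pt')

# A new instance starts with its own random initialization,
# so its weights differ from the original.
lin2 = torch.nn.Linear(50, 2)
print(torch.equal(lin.weight, lin2.weight))  # False

# After loading the saved state_dict the weights match exactly.
lin2.load_state_dict(torch.load('lin.pt'))
print(torch.equal(lin.weight, lin2.weight))  # True
```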
As you can see by running the following example, when I load the state_dict the weights are the same as those of the original module. If I use a plain Python list to hold the linear modules instead, they differ.
import torch

class M(torch.nn.Module):
    def __init__(self):
        super(M, self).__init__()
        self.multi_output = torch.nn.ModuleList(
            [torch.nn.Linear(50, 2) for i in range(10)]
        )

    def forward(self, x):
        # Apply every head to the same input
        outputs = [head(x) for head in self.multi_output]
        return outputs

m = M()
input = torch.zeros(10, 50)
print(m(input)[0])
print(m.multi_output[0].weight)

torch.save(m.state_dict(), 'm.pt')

m = M()
m.load_state_dict(torch.load('m.pt'))
print(m(input)[0])
print(m.multi_output[0].weight)
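For comparison, here is a sketch of the plain-list variant described above (the class name `MPlainList` and file name `m_plain.pt` are made up for illustration). Because the Linear modules live in an ordinary Python list rather than an nn.ModuleList, they are never registered as submodules, so state_dict() is empty and loading restores nothing:

```python
import torch

class MPlainList(torch.nn.Module):
    def __init__(self):
        super(MPlainList, self).__init__()
        # Plain list: these Linear modules are NOT registered as
        # submodules, so their parameters don't appear in state_dict().
        self.multi_output = [torch.nn.Linear(50, 2) for i in range(10)]

    def forward(self, x):
        return [head(x) for head in self.multi_output]

m = MPlainList()
print(len(m.state_dict()))  # 0 -- nothing gets saved
torch.save(m.state_dict(), 'm_plain.pt')

m2 = MPlainList()
m2.load_state_dict(torch.load('m_plain.pt'))
# load_state_dict succeeds but restores nothing: m2 keeps its own
# freshly initialized weights, which differ from m's.
print(torch.equal(m.multi_output[0].weight, m2.multi_output[0].weight))  # False
```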
The only thing I don’t understand is: first you have (50, 2) and then you have (50, 2) again. How is that even possible? The second layer should be (2, x).