Hmm, I used the signetf_lambda0.95.pkl checkpoint. Anyway, thank you again for your collaboration.
I’m dealing with a very similar issue; would you mind giving me some feedback?
I’m porting weights from a GoogLeNet trained in Lasagne over to PyTorch. In the Lasagne implementation the filters are not flipped (flip_filters=False), so the convolutions already match PyTorch’s cross-correlation. On the PyTorch side, I switched batch norm to LocalResponseNorm and turned on bias in the conv layers to match the Lasagne implementation, and I transposed the weights of the linear layer. I also corrected the noted bug in the PyTorch implementation of GoogLeNet (branch 3).
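For reference, here’s a minimal sketch of the kind of conversion I’m doing (lasagne_params, model, and the layer names are placeholders, not the actual GoogLeNet attribute names):

import torch

# lasagne_params: list of numpy arrays, e.g. from
# lasagne.layers.get_all_param_values(net); model: the PyTorch GoogLeNet.
with torch.no_grad():
    # Conv weights are (out_ch, in_ch, kH, kW) in both frameworks, so with
    # flip_filters=False they copy over directly.
    model.conv1.weight.copy_(torch.from_numpy(lasagne_params[0]))
    model.conv1.bias.copy_(torch.from_numpy(lasagne_params[1]))
    # Lasagne's DenseLayer stores W as (in_features, out_features); PyTorch's
    # nn.Linear expects (out_features, in_features), hence the transpose.
    model.fc.weight.copy_(torch.from_numpy(lasagne_params[-2].T))
    model.fc.bias.copy_(torch.from_numpy(lasagne_params[-1]))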
In the end I get random predictions, and the model does not learn at all during fine-tuning. Are you aware of any other Lasagne/PyTorch differences I should know about?
Thanks in advance
No, I’m not aware of any specific issues besides what was discussed in this thread. I would recommend porting the model layer by layer and making sure each layer yields the same results first.
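As a rough sketch of that comparison (x_np and lasagne_acts stand in for a fixed test input and the per-layer activations exported from the Lasagne model):

import numpy as np
import torch

# Run the same fixed input through one PyTorch layer at a time and compare
# against the activations saved from the Lasagne model.
model.eval()
x = torch.from_numpy(x_np).float()
with torch.no_grad():
    out = model.conv1(x)
print(np.abs(out.numpy() - lasagne_acts["conv1"]).max())  # expect ~1e-6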
Thanks for the response (and all else you do!)
Doing it layer by layer was the way to go. I also realized that the [:] in

for i, layer in enumerate(self.m3.state_dict()):
    # The [:] copies the new values in-place into the module's actual
    # tensor; plain assignment would only rebind the dict entry.
    self.m3.state_dict()[layer][:] = m3_weights[i]

really is necessary lol
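For anyone landing here later, a slightly more explicit variant of the same idea (just a sketch; model stands for self.m3, and m3_weights is assumed to line up with the state_dict order):

import torch

# copy_ writes in place into the existing tensors, the same effect as the
# [:] slice assignment, and converts dtype for you.
with torch.no_grad():
    for dst, src in zip(model.state_dict().values(), m3_weights):
        dst.copy_(torch.as_tensor(src))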