Is there any difference between Theano convolution and PyTorch convolution?

Hmm. I used the signetf_lambda0.95.pkl checkpoint. Anyway, thank you again for your collaboration. :wink:

I’m dealing with a very similar issue; would you mind giving me some feedback?

I’m moving weights from a GoogLeNet trained in Lasagne over to PyTorch. The Lasagne implementation does not flip its filters (flip_filters=False), so its convolutions already compute the same cross-correlation as PyTorch’s. On the PyTorch side, I switched batch norm to local response norm and turned bias on in the conv layers to match the Lasagne implementation, and transposed the weights of the linear layer. I also corrected the noted bug in branch 3 of the PyTorch GoogLeNet implementation. A rough sketch of the conversion is below.
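For concreteness, this is roughly what the per-layer conversion looks like. It’s only a sketch: port_linear and port_conv are illustrative helper names, and the arrays are assumed to come from lasagne.layers.get_all_param_values(network).

import torch

def port_linear(pt_linear, W, b):
    # Lasagne's DenseLayer stores W as (in_features, out_features), while
    # nn.Linear.weight is (out_features, in_features), hence the transpose.
    with torch.no_grad():
        pt_linear.weight.copy_(torch.from_numpy(W.T))
        pt_linear.bias.copy_(torch.from_numpy(b))

def port_conv(pt_conv, W, b, flip_filters=False):
    # Both frameworks lay conv weights out as (out_ch, in_ch, kH, kW).
    # With flip_filters=True Lasagne performs a true convolution, while
    # nn.Conv2d computes a cross-correlation, so the kernels must be
    # flipped spatially before copying.
    if flip_filters:
        W = W[:, :, ::-1, ::-1].copy()
    with torch.no_grad():
        pt_conv.weight.copy_(torch.from_numpy(W))
        pt_conv.bias.copy_(torch.from_numpy(b))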

In the end, I get random predictions, and the model does not even learn during fine-tuning. Are you aware of any other Lasagne/PyTorch differences that I should know about?

Thanks in advance

No, I’m not aware of any specific issues besides what was discussed in this thread. I would recommend porting the model layer by layer and making sure each layer yields the same results first. Something like the check sketched below can catch a mismatched layer early.
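As a minimal sketch of that check: pt_layer stands for one ported PyTorch layer, and lasagne_out is a placeholder for the activation of the matching Lasagne layer on the same input, e.g. obtained with lasagne.layers.get_output(lasagne_layer, x).eval().

import numpy as np
import torch

# Feed identical inputs through both layers and compare activations.
x = np.random.randn(1, 3, 224, 224).astype(np.float32)

with torch.no_grad():
    pt_out = pt_layer(torch.from_numpy(x)).numpy()

print(np.abs(pt_out - lasagne_out).max())  # expect ~1e-6 or smaller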


Thanks for the response (and all else you do!)

Doing it layer by layer was the way to go. I also realized that the [:] in

# The [:] makes this an in-place copy into the live parameter tensors;
# plain assignment would only rebind a key in the temporary state_dict().
for i, layer in enumerate(self.m3.state_dict()):
    self.m3.state_dict()[layer][:] = torch.as_tensor(m3_weights[i])

is very necessary lol
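For anyone hitting this later: state_dict() returns references to the live parameter tensors, so the slice assignment writes into the model in place, whereas plain assignment only rebinds a key in a temporary dict. An equivalent pattern is to build a complete dict and load it; this is a sketch, assuming m3_weights is ordered to match the state dict and the shapes already agree (model stands for self.m3 above).

import torch

# Build a full state dict from the ported arrays and load it in one go.
new_state = {k: torch.as_tensor(w) for k, w in zip(model.state_dict(), m3_weights)}
model.load_state_dict(new_state)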