Converting a Lasagne model to PyTorch

I wrote a simple script which converts a Lasagne model (saved in HDF5 format) to a PyTorch model (the architecture has only convolutional layers), but I am getting erroneous (read: blank) predictions.

I also checked the model weights and compared them to torchvision's VGG16, and saw that after the 2nd conv layer, the mean weight of each conv layer differs by an order of magnitude.
I guess for VGG this is due to the different input ranges (in PyTorch inputs are scaled to 0-1, while in Lasagne the mean image is subtracted but there is no scaling to 0-1).
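For what it's worth, the two preprocessing pipelines really do produce inputs on very different scales. A minimal NumPy sketch (the Caffe BGR means and the torchvision mean/std are the standard published values; the image is a deterministic fake):

```python
import numpy as np

# A deterministic fake 224x224 RGB image with values in 0-255, HWC layout
img = (np.arange(224 * 224 * 3) % 256).reshape(224, 224, 3).astype(np.float32)

# Lasagne/Caffe-style VGG preprocessing: BGR channel order, per-channel
# mean subtraction, NO scaling to 0-1
vgg_mean_bgr = np.array([103.939, 116.779, 123.68], dtype=np.float32)
caffe_input = img[:, :, ::-1] - vgg_mean_bgr      # values roughly in [-124, 151]

# torchvision-style preprocessing: scale to 0-1, then normalize per channel
tv_mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
tv_std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
torch_input = (img / 255.0 - tv_mean) / tv_std    # values roughly in [-2.2, 2.7]
```

So a network whose weights were trained against one input range will look "off by an order of magnitude" next to one trained against the other.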

But for my architecture, I fail to understand how the two could give different results.
The code is here
model in lasagne
model in pytorch

Could a reason be that PyTorch and Lasagne add ReLU in different ways?! (Lasagne allows it as an option in the conv layer itself.)
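To make that concrete: Lasagne's `Conv2DLayer` applies its `nonlinearity` (rectify by default) inside the layer, while in PyTorch the equivalent is an explicit two-module chain. A minimal sketch of the mapping (layer sizes are made up):

```python
import torch
import torch.nn as nn

# Lasagne: Conv2DLayer(..., nonlinearity=rectify) applies ReLU internally.
# PyTorch equivalent: a separate Conv2d followed by ReLU.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
relu = nn.ReLU()

x = torch.randn(1, 3, 8, 8)
y = relu(conv(x))       # same math as the fused Lasagne layer

assert (y >= 0).all()   # activations are clamped at zero either way
```

Mathematically the two should be identical, so by itself this split shouldn't change the outputs.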
I won’t mind writing a generic model converter if I get past this :slight_smile:

I would suspect either the memory layout of the inputs/weights (BCHW vs BHWC, for example) or the normalization. Start with a single-layer model, and slowly work towards more complex models.
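A concrete single-layer check worth running: one classic Theano-to-PyTorch gotcha is filter flipping. Lasagne's `Conv2DLayer` defaults to `flip_filters=True` (true convolution), while PyTorch's `conv2d` is cross-correlation, so with non-symmetric kernels the copied weights must be flipped spatially. A sketch:

```python
import torch
import torch.nn.functional as F

# A single-channel input and a non-symmetric 3x3 kernel
x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
w = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)

# PyTorch's conv2d is cross-correlation: the kernel is NOT flipped
out_torch = F.conv2d(x, w)

# Theano/Lasagne-style "true" convolution flips the kernel first;
# to reproduce it in PyTorch, flip the spatial dims of the copied weights
w_flipped = torch.flip(w, dims=[2, 3])
out_theano_style = F.conv2d(x, w_flipped)

# With a non-symmetric kernel the two disagree everywhere that matters
assert not torch.equal(out_torch, out_theano_style)
```

If the single-layer outputs only match after the flip, that's the conversion bug.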

Hey @smth
So I compared the weights and activations of the PyTorch and Lasagne VGG16 models converted from Caffe. They are the same.
My custom model has dilated convolution layers, and those are the ones producing different activations. Still investigating.
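In case it helps anyone, this is roughly how per-layer activations can be dumped on the PyTorch side with forward hooks, to diff against the Lasagne outputs for the same input (the toy model here is a stand-in, not my real architecture):

```python
import torch
import torch.nn as nn

# Stand-in model; hooks record each module's output so it can be
# compared layer-by-layer against the Lasagne activations
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, padding=1),
)

acts = {}

def save(name):
    def hook(module, inp, out):
        acts[name] = out.detach()
    return hook

for name, module in model.named_modules():
    if name:  # skip the top-level container itself
        module.register_forward_hook(save(name))

x = torch.randn(1, 3, 16, 16)
model(x)
# acts now maps layer names ('0', '1', '2') to activation tensors;
# compare e.g. acts['0'].mean() with the matching Lasagne layer's output
```

The first layer whose statistics diverge points at the mis-converted op.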

dilation might be subtly different in lasagne than in pytorch. maybe some boundary conditions (ceil / floor)
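One concrete thing to check on the PyTorch side: output size is floored as `H_out = floor((H + 2p - d*(k-1) - 1)/s + 1)`, and a dilated 3x3 kernel has an effective extent of 5, so "same"-size output needs `padding=2`, not `padding=1`. A sketch:

```python
import torch
import torch.nn as nn

# dilation=2 stretches a 3x3 kernel to an effective 5x5 extent,
# so preserving the spatial size requires padding=2
conv = nn.Conv2d(1, 1, kernel_size=3, dilation=2, padding=2)
x = torch.randn(1, 1, 17, 17)
assert conv(x).shape == (1, 1, 17, 17)

# padding=1 (correct for an undilated 3x3) silently shrinks the map
conv_bad = nn.Conv2d(1, 1, kernel_size=3, dilation=2, padding=1)
assert conv_bad(x).shape == (1, 1, 15, 15)
```

If the Lasagne side pads (or rounds) differently at the borders, the feature maps will disagree even with identical weights.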

Yeah, screw that.
I just wrote a semantic segmentation training pipeline in pytorch and got things working :smiley: It’s pretty awesome
Thanks for the help