ResNet has four components: layer1, layer2, layer3, layer4. Hypothesis one:
I initialize two attributes with the same layer4, like this: `self.layer4_1 = model_resnet.layer4` and `self.layer4_2 = model_resnet.layer4`.
That means layer4_1 and layer4_2 point to the same parameters; they share the same parameters, and I update them alternately.
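To see what "pointing to the same parameters" means concretely, here is a minimal sketch. A small `nn.Linear` stands in for ResNet's layer4 (an assumption, just to keep the example self-contained); the point is that assigning one module object to two attributes does not copy it, and `parameters()` deduplicates the shared tensors:

```python
import torch
import torch.nn as nn

# Stand-in for model_resnet.layer4 (assumption: a tiny Linear, not a real ResNet block)
layer4 = nn.Linear(4, 4)

class Model(nn.Module):
    def __init__(self, shared):
        super().__init__()
        # Both attributes reference the SAME module object.
        self.layer4_1 = shared
        self.layer4_2 = shared

m = Model(layer4)

# Same object, and the weight tensors share the same storage.
same_object = m.layer4_1 is m.layer4_2
same_storage = m.layer4_1.weight.data_ptr() == m.layer4_2.weight.data_ptr()
print(same_object, same_storage)  # True True

# parameters() deduplicates shared tensors: only one weight and one bias.
n_params = len(list(m.parameters()))
print(n_params)  # 2
```

So any gradient step that changes `layer4_1` immediately shows up through `layer4_2` as well.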
Hypothesis two:
I just define one layer4, like this: `self.layer4 = model_resnet.layer4`
And I update this layer4 twice as often as in hypothesis one.
I want to ask: what's the difference between the two hypotheses? Why are the model results different when the models converge?
My English is poor; if you are Chinese, we can talk in Chinese.
Hello and welcome! I like your post, and well done with the images.
You stated that we point to the same layer in hypothesis 1/top image. Whenever we update one of the boxes, the other box is also updated, right?
Then in hypothesis 2: every time you update the box, that's the same as updating one of the boxes in hypothesis 1, since they are connected. You say that in hypothesis 2 the update happens twice, so there are double the number of updates compared to hypothesis 1, right?
In hypothesis 1/top image, layer4_1 and layer4_2 share the same parameters, which is the same as layer4 in hypothesis 2/down image.
I update layer4_1 and layer4_2 alternately in hypothesis 1/top image, and I update layer4 in hypothesis 2/down image twice to get the same number of updates as in hypothesis 1.
Then the results should be the same as you suggest, at least to my understanding.
I would make sure that the models are the same after 1 and 2 update steps, just as a sanity check. Maybe PyTorch does something funky like making a copy when you assign the same module to two attributes; I'm not sure.
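That sanity check can be sketched directly. Again a tiny `nn.Linear` stands in for layer4 (an assumption for brevity); the check confirms that assignment does not copy the module, and that one optimizer step through one attribute moves the other:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the check deterministic

shared = nn.Linear(3, 3)  # stand-in for model_resnet.layer4 (assumption)

class Net(nn.Module):
    def __init__(self, block):
        super().__init__()
        # No copy happens here: both names bind the same module.
        self.layer4_1 = block
        self.layer4_2 = block

net = Net(shared)
# parameters() deduplicates, so the optimizer sees each shared tensor once.
opt = torch.optim.SGD(net.parameters(), lr=0.1)

before = net.layer4_1.weight.clone()
# One backward/step through layer4_1 only...
net.layer4_1(torch.randn(2, 3)).sum().backward()
opt.step()

# ...and layer4_2's weights moved too, and stay identical to layer4_1's.
still_tied = torch.equal(net.layer4_1.weight, net.layer4_2.weight)
moved = not torch.equal(before, net.layer4_2.weight)
print(still_tied, moved)  # True True
```

If both checks pass, a copy is not the explanation, and the remaining differences between the two setups are more likely things like optimizer state (e.g. momentum buffers) or gradient accumulation order.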