A problem with eval() confused me during testing

Here is my test code:

The problem is: if I follow the advice in the documentation and call eval() when testing, my outputs are quite strange. But if I remove the call, the results are much better.
I've learned that this function affects batch-norm layers and dropout layers, but I don't understand why there is such a big difference between the outputs generated in training and in testing. When I used my TensorFlow code, the outputs were similar. Did I miss something?

My model is cycle-gan.
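For context, here is a minimal sketch (not from the post above) of why train mode and eval mode can produce different outputs for a batch-norm layer: in train mode it normalizes with the current batch's statistics, while in eval mode it normalizes with the accumulated running statistics.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)
x = torch.randn(4, 3, 8, 8)

# Train mode: normalize with the batch's own mean/var,
# and update the running statistics as a side effect.
bn.train()
out_train = bn(x)

# Eval mode: normalize with the running mean/var accumulated so far.
bn.eval()
out_eval = bn(x)

# After only one training step, the running stats are still close to
# their init values (mean 0, var 1), so the two outputs differ.
print(torch.allclose(out_train, out_eval))  # False
```

If the running statistics never converge to something representative (e.g. because a layer is misused), eval() output can look much worse than train-mode output.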

Can you tell us what you mean by the outputs being 'quite strange' and 'much better'? Do you mean the quality of the generated images?

Can you post some images here to view?

In my view, the results shouldn't be affected by .eval(). It would be interesting to see why this happens. Which version of PyTorch are you using?

Thank you for your reply.
My account level didn't allow me to post more than one image yesterday. Here are some images from my training yesterday (with early stopping).
36_fakeA
with eval()

0_fakeA
without eval()

I've solved this problem today. It seems I had accidentally reused the same batch-norm layer several times… hmm, I'm not sure, actually. Is there any difference between these two code snippets?

now:

        norm_layer = get_norm_layer()
        net += [
            nn.Conv2d(ndf * nf_pre, ndf * nf_now, 4, 2, 1),
            norm_layer(ndf * nf_now),
            nn.LeakyReLU(0.2, True)
        ]

former:

        net += [
            nn.Conv2d(ndf * nf_pre, ndf * nf_now, 4, 2, 1),
            get_norm_layer(ndf * nf_now, norm),
            nn.LeakyReLU(0.2, True)
        ]
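Without seeing the old get_norm_layer implementation I can't say for certain, but the key difference is likely this: in the new code, get_norm_layer() returns a *constructor*, so every norm_layer(channels) call builds a fresh layer with its own weights and running statistics. If the old version ever handed back the *same module instance* in more than one place, that one layer's running mean/var would be updated by activations at different depths, and eval() would then normalize every occurrence with mixed, wrong statistics. A hypothetical sketch (this get_norm_layer is my guess, not your actual code):

```python
import functools
import torch
import torch.nn as nn

# "now" pattern: return a constructor; each call creates an independent layer.
def get_norm_layer():
    return functools.partial(nn.BatchNorm2d, affine=True)

norm_layer = get_norm_layer()
bn_a = norm_layer(16)
bn_b = norm_layer(16)
assert bn_a is not bn_b  # two independent layers, separate running stats

# The bug to watch for: one instance reused in several places. Both
# occurrences share a single set of running statistics, which then get
# updated by two different activation distributions during training.
shared = nn.BatchNorm2d(16)
net = nn.Sequential(shared, nn.ReLU(), shared)  # same module twice
```

In train mode this can still look fine (each forward pass uses batch statistics), which matches your observation that removing eval() "fixed" the outputs; the corruption only shows up when eval() switches to the shared running stats.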