Batch normalization for multiple datasets?

I am working on a task of generating synthetic data to help the training of my model. This means that training is performed on synthetic + real data, while testing is done on real data only.

I was told that batch normalization layers might be trying to estimate statistics and weights that work for all inputs during training, which is a problem since the distribution of my synthetic data is not exactly equal to the distribution of the real data. The idea would therefore be to keep different ‘copies’ of the batch normalization weights, so that the neural network estimates separate weights for synthetic and real data, and uses only the real-data weights for evaluation.

My question is, how to perform batch normalization in the aforementioned case?

I’m not sure whether only the weight (and bias) would be of interest, or the running stats as well.
In any case, you could create different batchnorm layers (one set for the synthetic and one for the real data) and switch between them during training, e.g. by passing an additional argument to forward and using it as a condition to select the desired layer(s).
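A minimal sketch of this idea, assuming 2D convolutional features; the names `DualBatchNorm2d` and `SmallNet` are hypothetical helpers, not PyTorch APIs:

```python
import torch
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    """Holds one BatchNorm2d per domain; selected via a flag in forward."""
    def __init__(self, num_features):
        super().__init__()
        self.bn_real = nn.BatchNorm2d(num_features)
        self.bn_synth = nn.BatchNorm2d(num_features)

    def forward(self, x, is_real=True):
        # Each copy keeps its own weight, bias, and running statistics.
        return self.bn_real(x) if is_real else self.bn_synth(x)

class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.bn = DualBatchNorm2d(8)

    def forward(self, x, is_real=True):
        # The domain flag is passed down so the right BN copy is used.
        return self.bn(self.conv(x), is_real=is_real)

model = SmallNet()
real = torch.randn(4, 3, 16, 16)
synth = torch.randn(4, 3, 16, 16)
out_real = model(real, is_real=True)     # updates bn_real stats only
out_synth = model(synth, is_real=False)  # updates bn_synth stats only
```

At evaluation time you would call `model.eval()` and forward real data with `is_real=True`, so only the real-data running statistics are used.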


Thanks a lot!
I am trying to implement that, but I am stuck on another problem.
My model is quite deep, so I have multiple batchnorm layers. I thought about creating copies of them and saving them in a ModuleDict.
But in forward I need to go through all batchnorm layers and assign the desired copies to them, and I don’t know how to do that. I thought it would be nice to access the layers by name, something like:
model.modules[name] = synth_batchnorm
I also tried something like model.named_modules()[i] = synth_batchnorm, but I get the error ‘generator object does not support item assignment’.
How can that be done?

I don’t know what exactly you are trying to do, but you can access the modules (layers) via their name, e.g.:

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()  # required before registering submodules
        self.fc = nn.Linear(1, 1)

    def forward(self, x):
        x = self.fc(x)
        return x

model = MyModel()
print(model.fc)
> Linear(in_features=1, out_features=1, bias=True)

model.fc = nn.Conv2d(3, 6, 3, 1, 1)
print(model.fc)
> Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
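Since `named_modules()` returns a generator, it cannot be indexed or assigned to directly. One way to replace layers by their string name is to collect the names first and then walk down to each parent with `getattr` before calling `setattr` — a sketch on a toy model:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# named_modules() yields (name, module) pairs; collect the names of all
# batchnorm layers before mutating the model.
bn_names = [name for name, m in model.named_modules()
            if isinstance(m, nn.BatchNorm2d)]

# Replace each batchnorm layer by name. For nested names like "block.0.bn",
# walk down to the parent module first, then setattr on the leaf.
for name in bn_names:
    parent = model
    *path, leaf = name.split('.')
    for p in path:
        parent = getattr(parent, p)
    old = getattr(parent, leaf)
    setattr(parent, leaf, nn.BatchNorm2d(old.num_features))
```

`setattr` on an `nn.Module` registers the new layer in `_modules`, so the replacement shows up in `state_dict()` and `parameters()` as usual.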

As a further innovation, the proposed CNT strategy trains a network using multiple datasets having different facial landmark annotation.

I was trying to save the copied layers in a dictionary. Then, in forward, I would iterate through the modules and assign the right copy to the right batchnorm layer. But I don’t know how to do that other than explicitly assigning a copy to each layer.
However, I managed to save the batchnorm parameters instead of the whole layers in two OrderedDicts, one for real and one for synthetic data. At the beginning of each epoch I can then simply call load_state_dict(strict=False) to load only the parameters contained in the dict. It seems to work, but I am not sure if this is efficient…
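That approach can be sketched roughly as follows — `bn_state` is a hypothetical helper that pulls only the batchnorm entries (weights, biases, running stats) out of the full `state_dict`:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

def bn_state(model):
    # Keep only state_dict entries that belong to a batchnorm layer.
    bn_keys = {name for name, m in model.named_modules()
               if isinstance(m, nn.BatchNorm2d)}
    return {k: v.clone() for k, v in model.state_dict().items()
            if k.rsplit('.', 1)[0] in bn_keys}

# One copy per domain, initialized from the current model.
real_bn = bn_state(model)
synth_bn = bn_state(model)

# ... train on synthetic batches ...
synth_bn = bn_state(model)                    # stash synthetic BN state
model.load_state_dict(real_bn, strict=False)  # restore real BN state
# ... train or evaluate on real batches ...
```

Since `load_state_dict` copies values into the existing parameter tensors, the swap itself is cheap; note however that any optimizer state attached to the batchnorm parameters (e.g. Adam moments) is shared across both copies, which may be one source of the degraded results mentioned below.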

Sorry, what do you mean by CNT strategy?

I also wonder if this constant saving and loading of parameters somehow interferes with the optimization in a manner I am unaware of, because separating these parameters gives me worse results than before.