Training Ensemble Networks

I am training an ensemble network consisting of two convolutional neural networks. Do I need to define an optimizer for each of them separately or would a single optimizer for the entire model do?

I would advise you to use a separate optimizer for each model/submodel. It gives you more control, and you don't have to synchronize the optimizer.zero_grad() and optimizer.step() calls (with a single optimizer you would otherwise have to train all the models at the same time). If you decide to use a single optimizer anyway, you can still define a different learning policy for each model through parameter groups, which are explained in the documentation.
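For example, parameter groups let one optimizer apply a different learning rate to each submodel. A minimal sketch, using small nn.Linear stand-ins in place of the two CNNs:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two CNNs
model1 = nn.Linear(8, 2)
model2 = nn.Linear(8, 2)

# One optimizer, one parameter group per model, each with its own lr
optimizer = torch.optim.SGD(
    [
        {"params": model1.parameters(), "lr": 1e-2},
        {"params": model2.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
```

Options passed per group (like lr here) override the defaults given to the optimizer constructor, so each submodel can get its own schedule while sharing one zero_grad()/step() pair.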

You can get by with a single optimizer if you have combined the models as submodules of a single module, like this:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model1 = Model1()  # your first CNN
        self.model2 = Model2()  # your second CNN

    def forward(self, x):
        # Average the predictions of the two submodels
        return (self.model1(x) + self.model2(x)) / 2.0
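With the ensemble wrapped in one module like that, a single optimizer built from net.parameters() covers both submodels. A runnable sketch, again with nn.Linear stand-ins for the CNNs:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Stand-in submodels; replace with your two CNNs
        self.model1 = nn.Linear(8, 2)
        self.model2 = nn.Linear(8, 2)

    def forward(self, x):
        # Average the two submodels' predictions
        return (self.model1(x) + self.model2(x)) / 2.0

net = Net()
# One optimizer sees the parameters of both submodels
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.randn(4, 8)
loss = net(x).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because both submodels are registered as attributes of Net, net.parameters() yields the parameters of both, so one backward()/step() updates the whole ensemble.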

But if you have defined two separate models and are ensembling them during training, you'd need an optimizer for each one.

A separate optimizer for each model makes it easier to give each one its own learning rate and optimization method than managing parameter groups in a single optimizer.
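As a sketch of that two-optimizer setup (again with hypothetical nn.Linear stand-ins for the CNNs), each model can use a different algorithm and be stepped independently:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two CNNs
model1 = nn.Linear(8, 2)
model2 = nn.Linear(8, 2)

# Separate optimizers: different algorithms and learning rates
opt1 = torch.optim.Adam(model1.parameters(), lr=1e-3)
opt2 = torch.optim.SGD(model2.parameters(), lr=1e-2)

x = torch.randn(4, 8)

# Update model1 only; model2 and opt2 are untouched this step
opt1.zero_grad()
model1(x).sum().backward()
opt1.step()
```

Since the zero_grad()/step() calls are independent, the two models don't have to be trained on the same batches or even at the same time.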