I am training an ensemble network consisting of two convolutional neural networks. Do I need to define an optimizer for each of them separately or would a single optimizer for the entire model do?
I would advise you to use a separate optimizer for each model/submodel. It gives you more control, and you don't have to synchronize the optimizer.zero_grad() and optimizer.step() calls across models (otherwise, you have to train all the models at the same time). If you decide to use a single optimizer, you can still define a different learning rate (and other hyperparameters) per model through parameter groups, which are explained in the torch.optim documentation.
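For instance, here is a minimal sketch of the parameter-group approach, assuming two hypothetical submodels (the nn.Linear stand-ins, the Adam choice, and the learning rates are placeholders for your own setup):

import torch
import torch.nn as nn

# Hypothetical stand-ins for your two CNNs; any nn.Module works.
model1 = nn.Linear(10, 1)
model2 = nn.Linear(10, 1)

# A single optimizer with one parameter group per submodel,
# each group carrying its own learning rate.
optimizer = torch.optim.Adam([
    {'params': model1.parameters(), 'lr': 1e-3},
    {'params': model2.parameters(), 'lr': 1e-4},
])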
You can get by with a single optimizer if you wrap the ensembled models as submodules of a single nn.Module, like this:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # model1 and model2 are assumed to be your two CNN classes;
        # as submodules, their parameters show up in Net.parameters().
        self.model1 = model1()
        self.model2 = model2()

    def forward(self, x):
        # Ensemble by averaging the two networks' outputs.
        return (self.model1(x) + self.model2(x)) / 2.0
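With that wrapper, one optimizer over Net.parameters() updates both submodels with a single zero_grad()/step() pair. A minimal usage sketch, assuming criterion, x, and target come from your own loss function and data loader (the Adam choice and learning rate are placeholders):

net = Net()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

optimizer.zero_grad()
loss = criterion(net(x), target)  # criterion, x, target: your loss and batch
loss.backward()
optimizer.step()                  # one step updates model1 and model2 together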
But if you have defined two separate models and are ensembling their outputs during training, you'd need an optimizer for each one. A separate optimizer per model also makes it easier to use different learning rates and optimization algorithms than a single optimizer would.
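As an illustration of that two-optimizer setup, here is a sketch; the models, loss, batch, and hyperparameters are all placeholders, and note that the two zero_grad()/step() pairs have to be kept in sync by hand:

import torch
import torch.nn as nn

# Hypothetical stand-ins for the two CNNs.
model1 = nn.Linear(10, 1)
model2 = nn.Linear(10, 1)

# Separate optimizers: different algorithms and learning rates.
opt1 = torch.optim.SGD(model1.parameters(), lr=1e-2, momentum=0.9)
opt2 = torch.optim.Adam(model2.parameters(), lr=1e-4)

criterion = nn.MSELoss()      # placeholder loss
x = torch.randn(8, 10)        # dummy batch
target = torch.randn(8, 1)    # dummy target

# Ensemble the outputs, then update both models; the
# zero_grad()/step() calls must be synchronized manually.
opt1.zero_grad()
opt2.zero_grad()
output = (model1(x) + model2(x)) / 2.0
loss = criterion(output, target)
loss.backward()
opt1.step()
opt2.step()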