Ensemble of pretrained models

Hi,

I have 3 models trained on internal data. I want to ensemble them by averaging their outputs.

import torch
import torch.nn as nn

def build_network():
    model1 = InceptionV3()
    model1.load_state_dict(torch.load('model1_weights.pth'))
    model1.eval()

    model2 = ResNet50()
    model2.load_state_dict(torch.load('model2_weights.pth'))
    model2.eval()

    model3 = AnotherModel()
    model3.load_state_dict(torch.load('model3_weights.pth'))
    model3.eval()

    # freeze all pretrained weights
    for model in (model1, model2, model3):
        for param in model.parameters():
            param.requires_grad = False

    model = Ensemble([model1, model2, model3])
    return model

class Ensemble(nn.Module):

    def __init__(self, model_list):
        super().__init__()
        self.model_list = nn.ModuleList(model_list)
    
    def forward(self, x, y):
        # average the predictions of all models
        return torch.stack([model(x, y) for model in self.model_list]).mean(0)
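
As a quick sanity check of the forward pass, here is a minimal sketch with small stand-in models. TinyModel is a hypothetical placeholder matching the two-input signature the Ensemble expects; your real models would take its place:

import torch
import torch.nn as nn

class TinyModel(nn.Module):
    # hypothetical stand-in for the real pretrained models,
    # matching the two-input signature the Ensemble expects
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x, y):
        return self.fc(x) + self.fc(y)

ensemble = Ensemble([TinyModel() for _ in range(3)])
x, y = torch.randn(8, 4), torch.randn(8, 4)
out = ensemble(x, y)
print(out.shape)  # torch.Size([8, 2]): the mean over the 3 model outputs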


Now when I train the model using a typical PyTorch training process, I get the following error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Am I using the right approach?

The error you are seeing is raised when you call backward on the output, since you have frozen all trainable parameters of all 3 models by setting their .requires_grad attribute to False.
If you want to compute gradients, and thus need to call backward on any output/loss, you have to make sure some parameters are still trainable.
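
For example, you could freeze everything and then unfreeze only the final layer of one model. A minimal sketch, assuming the head of model3 is an attribute called fc (substitute whatever your architecture actually uses):

# freeze everything first
for param in model3.parameters():
    param.requires_grad = False

# then unfreeze only the final layer (assuming the head is called `fc`)
for param in model3.fc.parameters():
    param.requires_grad = True

# pass only the trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model3.parameters() if p.requires_grad), lr=1e-4
)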


I’ve set requires_grad = True for model3's parameters and called model3.train() instead of eval(). Just for my understanding: when training ensembles of pretrained models, do they have to be set to evaluation mode?

When you train models, they should be in train mode (even if they are part of an ensemble), since layers such as dropout and batch norm behave differently during training and evaluation.
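
Note that calling .train() or .eval() on the ensemble recurses into all submodules held by the nn.ModuleList, so you don't need to set the mode on each model individually. A minimal sketch, reusing the Ensemble defined above:

ensemble = Ensemble([model1, model2, model3])

ensemble.train()   # puts model1, model2, and model3 into train mode
print(all(m.training for m in ensemble.model_list))  # True

ensemble.eval()    # switches dropout/batch norm layers to inference behavior
print(any(m.training for m in ensemble.model_list))  # False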