I have implemented an ensemble consisting of 3-layer MLPs with the following architecture:
class MLP(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.relu1 = torch.nn.ReLU()
        self.batch1 = torch.nn.BatchNorm1d(H)
        self.hidden1 = torch.nn.Linear(H, H)
        self.relu2 = torch.nn.ReLU()
        self.batch2 = torch.nn.BatchNorm1d(H)
        self.hidden2 = torch.nn.Linear(H, H)
        self.hidden3 = torch.nn.Linear(H, D_out)
        self.logSoftMax = torch.nn.LogSoftmax(dim=1)
        self.SoftMax = torch.nn.Softmax(dim=1)
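The forward pass is along these lines (sketched from the layers above, so the exact ordering may not match my code verbatim); it returns both the log-softmax and softmax outputs, which is why the evaluation code below unpacks two values:

def forward(self, x):
    # first block: linear -> ReLU -> batch norm
    h = self.batch1(self.relu1(self.linear1(x)))
    # second block: linear -> ReLU -> batch norm
    h = self.batch2(self.relu2(self.hidden1(h)))
    h = self.hidden2(h)
    logits = self.hidden3(h)
    # return log-probabilities and probabilities
    return self.logSoftMax(logits), self.SoftMax(logits)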
When testing the loss for the ensemble, I don't see the loss decreasing as the number of models increases. For a single model it starts out at a reasonable value, and then as the ensemble grows the loss goes up and then down. I think there might be something wrong with how we combine the predictions. This is how we are doing it:
def avg_evaluate(models):
    loss_fn = torch.nn.NLLLoss()
    loss = 0
    total = 0
    for batch_idx, (data, target) in enumerate(test_loader):
        y_preds = []
        for idx, model in enumerate(models):
            model = model.eval()
            # flatten each sample before feeding it to the MLP
            data = data.view(data.shape[0], -1)
            y_pred, _ = model(data)
            y_preds.append(y_pred)
        # average the per-model predictions and accumulate the batch loss
        loss = loss + loss_fn(torch.div(torch.stack(y_preds, dim=0).sum(dim=0), len(models)), target).item()
    print("Final loss")
    print(loss / len(test_loader))
    loss = loss / len(test_loader)
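To make the averaging step concrete, here is a tiny standalone version of the same stack-and-divide operation on dummy log-probabilities (the shapes and values are made up purely for illustration):

import torch

loss_fn = torch.nn.NLLLoss()
# dummy log-probabilities from two "models" for a batch of 2 samples, 3 classes
p1 = torch.log_softmax(torch.randn(2, 3), dim=1)
p2 = torch.log_softmax(torch.randn(2, 3), dim=1)
target = torch.tensor([0, 2])

# same operation as in avg_evaluate: stack along a new model dimension,
# sum over it, and divide by the number of models
avg = torch.div(torch.stack([p1, p2], dim=0).sum(dim=0), 2)
print(loss_fn(avg, target))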
Does anyone have any idea what could be wrong? I would appreciate any help! Thanks.