I have implemented an ensemble consisting of 3-layer MLPs with the following architecture:
class MLP(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.relu1 = torch.nn.ReLU()
        self.batch1 = torch.nn.BatchNorm1d(H)
        self.hidden1 = torch.nn.Linear(H, H)
        self.relu2 = torch.nn.ReLU()
        self.batch2 = torch.nn.BatchNorm1d(H)
        self.hidden2 = torch.nn.Linear(H, H)
        self.hidden3 = torch.nn.Linear(H, D_out)
        self.logSoftMax = torch.nn.LogSoftmax(dim=1)
        self.SoftMax = torch.nn.Softmax(dim=1)
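The forward pass is along these lines (sketched from the layers above, so the exact ordering may not match my code verbatim); it returns both the log-softmax and softmax outputs, which is why the evaluation code below unpacks two values:

def forward(self, x):
    # first block: linear -> ReLU -> batch norm
    h = self.batch1(self.relu1(self.linear1(x)))
    # second block: linear -> ReLU -> batch norm
    h = self.batch2(self.relu2(self.hidden1(h)))
    h = self.hidden2(h)
    logits = self.hidden3(h)
    # return log-probabilities and probabilities
    return self.logSoftMax(logits), self.SoftMax(logits)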
When testing the loss for the ensemble, I don't see the loss decreasing as the number of models increases. For a single model it starts out at a reasonable value, and then as the ensemble grows the loss goes up and then down. I think there might be something wrong with how we combine the predictions. This is how we are doing it:
def avg_evaluate(models):
    loss_fn = torch.nn.NLLLoss()
    loss = 0
    total = 0
    for batch_idx, (data, target) in enumerate(test_loader):
        y_preds = []
        for idx, model in enumerate(models):
            model = model.eval()
            # flatten each sample before feeding it to the MLP
            data = data.view(data.shape[0], -1)
            y_pred, _ = model(data)
            y_preds.append(y_pred)
        # average the per-model predictions and accumulate the batch loss
        loss = loss + loss_fn(torch.div(torch.stack(y_preds, dim=0).sum(dim=0), len(models)), target).item()
    print("Final loss")
    print(loss / len(test_loader))
    loss = loss / len(test_loader)
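To make the averaging step concrete, here is a tiny standalone version of the same stack-and-divide operation on dummy log-probabilities (the shapes and values are made up purely for illustration):

import torch

loss_fn = torch.nn.NLLLoss()
# dummy log-probabilities from two "models" for a batch of 2 samples, 3 classes
p1 = torch.log_softmax(torch.randn(2, 3), dim=1)
p2 = torch.log_softmax(torch.randn(2, 3), dim=1)
target = torch.tensor([0, 2])

# same operation as in avg_evaluate: stack along a new model dimension,
# sum over it, and divide by the number of models
avg = torch.div(torch.stack([p1, p2], dim=0).sum(dim=0), 2)
print(loss_fn(avg, target))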
Does anyone have any idea what could be wrong? I would appreciate any help! Thanks.