Hey,
I am trying to train an ensemble of models on a dataset. At the moment I am using a wrapper that collects all the models of the ensemble in a single Module:
import torch.nn as nn

class Ensemble(nn.Module):
    def __init__(self, models):
        super().__init__()
        # ModuleList so the submodels' parameters are registered with the wrapper
        self.models = nn.ModuleList(models)

    def forward(self, x):
        # run every model on the same input and collect the per-model predictions
        y = []
        for model in self.models:
            y.append(model(x))
        return y
This works, but I don't think it is very efficient: the for-loop does not parallelize across the models, and the backward pass still needs a lot of memory, since all results are collected into a single loss. What would be the best way to make the training of this ensemble more efficient?
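For context, this is roughly how the outputs end up in one loss at the moment (simplified sketch; criterion, optimizer, x and target are placeholders for my real training code):

# simplified training step
outputs = ensemble(x)                       # one prediction per model
losses = [criterion(out, target) for out in outputs]
loss = sum(losses)                          # all models end up in one graph
optimizer.zero_grad()
loss.backward()                             # so all intermediate activations stay in memory until here
optimizer.step()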
I was thinking of separating the models during training and putting them on different GPUs, but I do not have enough GPUs to give every model its own GPU…
Is it somehow possible to loop over the dataset and over the models so that the training is distributed efficiently over the available GPUs? Or is there maybe a totally different approach?
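Roughly what I have in mind is a round-robin placement like this (just a sketch; models is the list of ensemble members from above, and the batch handling in the last comment is hand-waved):

import torch

num_gpus = torch.cuda.device_count()
devices = [torch.device(f"cuda:{i}") for i in range(num_gpus)]

# round-robin placement: model i lives on GPU i % num_gpus
for i, model in enumerate(models):
    model.to(devices[i % num_gpus])

# each model would then see the batch copied to its own device:
# outputs = [model(x.to(devices[i % num_gpus])) for i, model in enumerate(models)]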
Thanks for helping!