I am working with ensembles of neural networks, roughly of the following form:
import torch
import torch.nn as nn

class NNEnsemble(nn.Module):
    def __init__(self, n_estimators):
        super().__init__()
        self.n_estimators = n_estimators
        self.estimators = nn.ModuleList([generate_model() for _ in range(self.n_estimators)])

    def forward(self, x):
        # evaluate every estimator on the same input and sum the predictions
        out = torch.stack([est(x) for est in self.estimators], dim=1)
        return torch.sum(out, dim=1)
Here, generate_model() builds a small base model (e.g. a 3-layer MLP with 128 hidden units). As it turns out, the list comprehension/for-loop in forward is rather slow: with an ensemble of 100 such small networks I only reach < 10% GPU utilization. Is there any way to speed up this operation, e.g. with a "parallel" list comprehension?
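For reference, this is roughly the kind of batched evaluation I'm hoping for. A minimal sketch using torch.func (my assumptions: PyTorch >= 2.0 and all estimators sharing the same architecture; the shapes and the generate_model() call are placeholders):

import copy
import torch
from torch.func import functional_call, stack_module_state, vmap

models = [generate_model().cuda() for _ in range(100)]
# stack all parameters/buffers along a new leading "model" dimension
params, buffers = stack_module_state(models)

# stateless skeleton on the meta device, used only for its architecture
base = copy.deepcopy(models[0]).to("meta")

def call_one(p, b, x):
    return functional_call(base, (p, b), (x,))

x = torch.randn(64, 32, device="cuda")  # hypothetical input batch
# vmap over the model dimension; every model sees the same x (in_dims=None)
out = vmap(call_one, in_dims=(0, 0, None))(params, buffers, x)
pred = out.sum(dim=0)  # same reduction as the for-loop version

I'm not sure whether this is the idiomatic pattern, which is part of my question.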
I found one post about grouped convolutions that mentions a similar structure, but there was no real solution there beyond waiting for cuDNN support; a batched-matmul version of the same idea is sketched below.
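Since my base models are MLPs rather than conv nets, I believe the grouped-convolution trick translates to a single batched matmul: stack the per-model weights and evaluate one layer of all estimators with one torch.bmm call. A minimal sketch, assuming plain nn.Linear layers and hypothetical shapes:

import torch

n_models, d_in, d_out, batch = 100, 32, 128, 64
# stacked weights/biases of 100 hypothetical linear layers
W = torch.randn(n_models, d_in, d_out, device="cuda")
b = torch.randn(n_models, 1, d_out, device="cuda")
x = torch.randn(batch, d_in, device="cuda")

# one bmm evaluates this layer for all models: (n, batch, d_in) @ (n, d_in, d_out)
out = torch.bmm(x.unsqueeze(0).expand(n_models, -1, -1), W) + b
# out has shape (n_models, batch, d_out); summing over dim 0 gives the ensemble output

Chaining this per layer (plus the nonlinearity) would replace the Python loop entirely, though it means rewriting the estimators rather than wrapping them.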