Ok, interesting idea.
So as far as I understand your approach, each model uses its own mean and std, which were calculated on the positive samples for the corresponding class. Am I right?
Did this approach outperform 6 different models using a global mean and std?
However, you could move the standardization into the Dataset and return 6 differently normalized views of each sample. That way, you would push this computation into the DataLoader, i.e. onto the CPU, while your model ensemble calculates the predictions.
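A minimal sketch of what I mean (the names `MultiNormDataset`, `means`, and `stds` are just placeholders; I'm assuming you have the 6 per-class statistics as tensors):

```python
import torch
from torch.utils.data import Dataset

class MultiNormDataset(Dataset):
    """Returns each sample normalized 6 ways, one per class-specific mean/std."""

    def __init__(self, data, targets, means, stds):
        # data: [N, ...] tensor; means/stds: sequences of 6 per-class statistics
        self.data = data
        self.targets = targets
        self.means = means
        self.stds = stds

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x, y = self.data[idx], self.targets[idx]
        # Stack the 6 differently standardized views into a [6, ...] tensor
        views = torch.stack([(x - m) / s for m, s in zip(self.means, self.stds)])
        return views, y
```

Wrapped in a `DataLoader` with `num_workers > 0`, the normalization would then run in the worker processes in the background.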
What is the overall accuracy of the model ensemble compared to the first model (~40% accuracy)?