Sorry if this is more of a general methodology question than a PyTorch one.
I’m training a mushroom classifier, and some classes might be merged for various reasons (classes are small, labels are often switched, the end user will consider them similar, etc). Is it best to create the merged classes before training or train with the original classes and “merge” afterwards?
If I merge afterwards, should I add up the probability estimates after soft-maxing, or combine (add) the raw network outputs before applying softmax?
If merging afterwards usually works equally well, it has the advantage that one can experiment with different mergers without retraining.
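To make the two post-hoc options concrete, here is a minimal sketch (the logits and the class grouping are made up for illustration):

```python
import torch
import torch.nn.functional as F

# One sample, 5 hypothetical classes; classes 3 and 4 will be merged.
logits = torch.tensor([[2.0, 0.5, -1.0, 1.0, 1.2]])
merge = [[0], [1], [2], [3, 4]]  # index groups after merging

# Option A: softmax first, then add the probabilities within each group.
probs = F.softmax(logits, dim=1)
merged_probs_a = torch.stack(
    [probs[:, idx].sum(dim=1) for idx in merge], dim=1
)

# Option B: add the raw logits within each group, then softmax.
merged_logits = torch.stack(
    [logits[:, idx].sum(dim=1) for idx in merge], dim=1
)
merged_probs_b = F.softmax(merged_logits, dim=1)

# Both yield valid distributions over the merged classes,
# but they are not numerically equivalent in general.
print(merged_probs_a)
print(merged_probs_b)
```

Note that the two options generally give different numbers: adding logits is not the same operation as adding probabilities.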
Actually, training with more classes usually needs more data and a few tricks to help the network learn. So the first criterion is whether you can still achieve good results in the unmerged case at all.
Now let’s assume you have a model pretrained on all mushroom classes, but you are only interested in the common ones and, as you mentioned, want to merge the classes with small contributions. In this case, I prefer merging the probabilities after softmax, as it is more intuitive. That said, both approaches should work just fine, since softmax acts mostly like a normalization over the logits.
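As a side note, summing the post-softmax probabilities of a group is mathematically identical to reducing that group's logits with `logsumexp` before the softmax, so you can also merge in logit space without changing the result. A small sketch with made-up numbers:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.5, -0.3, 0.8, 0.9]])  # made-up logits, 4 classes
group = [2, 3]                                  # hypothetical classes to merge

# Summing probabilities after softmax...
probs = F.softmax(logits, dim=1)
merged_prob = probs[:, group].sum(dim=1)

# ...matches a logsumexp reduction of the group's logits before softmax.
reduced = torch.cat(
    [logits[:, :2], logits[:, group].logsumexp(dim=1, keepdim=True)], dim=1
)
merged_prob_b = F.softmax(reduced, dim=1)[:, -1]

torch.testing.assert_close(merged_prob, merged_prob_b)
```

This also supports the point above: experimenting with different mergers only changes a cheap post-processing step, not the trained network.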
In my case, I used a pretrained scene recognition network, but I was not interested in distinguishing different types of chairs and the like, so I just took the final probabilities and added them up.