I would like to clarify a case with an image classifier; I'll describe it in general first and then provide an example in Colab.
I have images from 7 classes that a simple ResNet classifier predicts easily, with 98+% accuracy. I modified the dataset slightly: I took the samples from one class and randomly split them in half; the first half kept the previous class label, while the other half was assigned a new class. My goal was to confuse the classifier between the two classes in the pair, since the sample distribution in both classes is the same. I expected the network to classify samples from the two similar classes with equal probability: for example, if the original class was predicted with 98% accuracy, then on the new dataset the predictions would split roughly 49% vs 49%, or at least without a landslide difference.
But the classifier preferred to assign samples from both subsets to one of the two classes, e.g. 70% vs 30%.
I understand that I'm dealing with bias, but I wonder whether this behavior is normal.
The reason I set this up is my experiments with metadata, which I supply to a semantic segmentation network via a separate input; the metadata is a hint about the sample's domain.
I'm convinced that metadata becomes important when the network cannot guess, from the source sample alone, which domain it belongs to. I quantify how easily the network can guess the domain with the ResNet classifier.
And I’m a bit uncomfortable that the classifier prefers one class when it cannot distinguish between two (or more…) classes.
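For what it's worth, part of the effect may simply be argmax amplification: if the softmax probabilities for the twin classes are nearly equal but carry a small, consistent offset, the hard predictions can still land overwhelmingly on one class. A toy sketch with made-up logits (not from any real model):

```python
import torch

# Hypothetical logits for the two twin classes: every row is close to a
# 50/50 posterior, but class 0 is consistently a hair ahead.
logits = torch.tensor([[0.12, -0.08],
                       [0.05,  0.02],
                       [0.30,  0.25]])
probs = torch.softmax(logits, dim=1)  # each row near 0.5 / 0.5
preds = logits.argmax(dim=1)          # yet argmax picks class 0 every time
```

So a 70/30 count in the confusion matrix does not necessarily mean the predicted probabilities themselves are 0.7/0.3; it may help to inspect the softmax outputs for the pair, not just the argmax counts.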
I might be wrong, and I can accept this bias if somebody from this respectable community can confirm it or point me to a published discussion on the topic.
I prepared a test case for illustration:
Here I overrode the CIFAR10 dataset's `__getitem__` method so that when it returns an image of class 8 ('ship'), it can randomly reassign it to a new class 11.
Similarly, a sample of class 9 ('truck') can either keep its label or be assigned the new class 10 ('truk_m', i.e. 'modified truck').
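The relabeling step could be sketched like this (a minimal sketch, not my exact Colab code; `relabel_half` is a hypothetical helper, and it relabels once up front rather than inside `__getitem__`):

```python
import random

def relabel_half(targets, old_class, new_class, p=0.5, seed=0):
    """Reassign roughly a fraction p of old_class samples to new_class.
    Images are untouched, so both labels share one sample distribution."""
    rng = random.Random(seed)  # fixed seed: labels stay stable across epochs
    return [new_class if t == old_class and rng.random() < p else t
            for t in targets]

# For torchvision's CIFAR10, the integer labels live in .targets, e.g.:
# ds = torchvision.datasets.CIFAR10(root, train=True, download=True)
# ds.targets = relabel_half(ds.targets, old_class=9, new_class=10)  # truck pair
# ds.targets = relabel_half(ds.targets, old_class=8, new_class=11)  # ship pair
```

One design point worth checking: relabeling once, with a fixed seed, keeps each sample's label constant across epochs. If instead the label is re-drawn on every `__getitem__` call, the label itself becomes per-access noise, and no classifier can do better than chance on the twin pair even in principle.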
The simple classifier from the PyTorch tutorial, after training, doesn't distribute predictions equally between the similar classes but usually prefers one class over the other (see the confusion matrix at the bottom).
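For reference, the preference within a twin pair can be read off a confusion matrix built like this (a minimal sketch; the `labels`/`preds` tensors here are made-up stand-ins for real model output):

```python
import torch

def confusion_matrix(preds, labels, num_classes):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(labels.tolist(), preds.tolist()):
        cm[t, p] += 1
    return cm

# Made-up predictions for the twin pair 9/10, standing in for real output:
labels = torch.tensor([9, 9, 10, 10, 10, 10])
preds  = torch.tensor([9, 10, 9, 9, 9, 10])
cm = confusion_matrix(preds, labels, num_classes=12)
# Column sums over the pair show which of the two labels the model prefers:
pair_preference = (cm[:, 9].sum().item(), cm[:, 10].sum().item())
```

Comparing the two column sums (here 4 vs 2 in favor of class 9) is exactly the lopsided split described above, independent of how accurate the model is on the remaining classes.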