I’m working on an image classification problem (binary for the moment, but it will scale to 3 or 4 classes later on) with InceptionV3.
When training I use the raw output (logits) of the classifier, which is a fully connected layer with 2 output neurons. My loss function is a focal loss built on BCEWithLogitsLoss, due to high class imbalance. It seems to work pretty well, although there is some overfitting.
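For reference, this is roughly the kind of focal loss I mean, a minimal sketch on top of BCEWithLogitsLoss (not my exact code; the `alpha` and `gamma` values are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """Focal loss built on BCE-with-logits, for class imbalance."""
    def __init__(self, alpha=0.25, gamma=2.0):  # assumed hyperparameters
        super().__init__()
        self.alpha = alpha  # weight for the positive class
        self.gamma = gamma  # focusing parameter

    def forward(self, logits, targets):
        # Per-element BCE on raw logits (numerically stable)
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p_t = torch.exp(-bce)  # probability assigned to the true class
        alpha_t = self.alpha * targets + (1 - self.alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** self.gamma * bce).mean()
```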
When testing the saved model on unseen data, my metrics vary a lot depending on the batch size used for testing. Something is clearly wrong, but I cannot find it. I have two ideas:
- I’m doing something wrong when testing the network, since the raw logits probably need an activation function at inference time: a sigmoid or a softmax followed by thresholding? (See the first sketch after this list.)
- Normalization/data augmentation: I normalize my data with the mean and standard deviation of the whole dataset, for both training and testing, and I apply data augmentation to the training data but not to the test data. I think this is correct. I’ve seen on this forum that the batch normalization layers may affect the results at test time, but that should not be the case here, since I set the model to eval mode (second sketch below).
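For the first point, this is what I mean by adding an activation and thresholding at test time (a sketch, assuming a 2-neuron output trained with one-hot targets; `model` and `images` are placeholders):

```python
import torch

model.eval()
with torch.no_grad():
    logits = model(images)  # shape: (batch, 2)
    # Option A: per-class sigmoid, then take the higher-probability class
    preds_sigmoid = torch.sigmoid(logits).argmax(dim=1)
    # Option B: softmax over the 2 neurons, then argmax
    preds_softmax = torch.softmax(logits, dim=1).argmax(dim=1)
    # Note: both activations are monotonic, so argmax gives the same
    # prediction as argmax over the raw logits; the choice only matters
    # when thresholding a probability, e.g. probs[:, 1] > 0.5
```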
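And for the second point, my evaluation loop looks roughly like this (again a sketch; `model` and `test_loader` are placeholders, and `targets` are assumed to be class indices; if they are one-hot for BCE, convert with `targets.argmax(dim=1)`):

```python
import torch

model.eval()  # puts BatchNorm/Dropout into inference behaviour
all_preds, all_targets = [], []
with torch.no_grad():
    for images, targets in test_loader:
        logits = model(images)
        all_preds.append(logits.argmax(dim=1).cpu())
        all_targets.append(targets.cpu())

# Metrics computed once over the full test set, so the value should
# not change with the test batch size (unlike averaging per-batch metrics)
preds = torch.cat(all_preds)
targets = torch.cat(all_targets)
accuracy = (preds == targets).float().mean().item()
```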