I’m experimenting with a very simple image classification task: classifying chest X-rays as normal, bacterial infection, or viral infection.
Please see my GitHub repository: kail85/SimpleTorchClassification
My training script is in Train.ipynb. The data are grayscale images of size 256 x 256, split into train, val and test sets. The ratio of the 3 image categories is the same in each set.
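To make the split concrete, here is a minimal sketch of a per-class (stratified) split in plain Python. The function name `stratified_split` and the 70/15/15 fractions are placeholders for illustration, not the values used for the actual data.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test, preserving per-class ratios.

    The default fractions are placeholder values, not the real ones.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # whatever is left goes to test
    return train, val, test
```

Because each class is shuffled and cut separately, every set ends up with the same class proportions, up to rounding.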
I wrote the script in both PyTorch and MATLAB. Using mobilenetv2, MATLAB stops at epoch 9 with a final validation loss of 0.1714 and an accuracy of 82.91%.
In PyTorch, with the exact same training settings, the validation loss never drops below 1.0 and there is severe overfitting.
I’m not sure what happened. Is there anything wrong with my validation script? I tried mobilenetv3_small, mobilenetv3_large, and resnet18, and they all have the same issue.
Only alexnet reaches a validation loss of about 0.4, at epoch 20.
I’ve been stuck here for a few days. I’d appreciate any suggestions. Thank you.
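For comparison, a validation pass generally has the shape below. This is a generic sketch, not a copy of the code in Train.ipynb: the function name `validate`, its arguments, and the `device` default are assumptions, and `criterion` is assumed to use mean reduction (the default for `nn.CrossEntropyLoss`).

```python
import torch
import torch.nn as nn


def validate(net, loader, criterion, device="cpu"):
    """Return (mean loss, accuracy) over an iterable of (inputs, labels) batches."""
    net.eval()  # BatchNorm uses running statistics, Dropout is disabled
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():  # no gradients are needed for validation
        for inputs_val, labels_val in loader:
            inputs_val = inputs_val.to(device)
            labels_val = labels_val.to(device)
            outputs_val = net(inputs_val)
            batch_size = labels_val.size(0)
            # Weight by batch size so a smaller last batch does not skew the mean.
            total_loss += criterion(outputs_val, labels_val).item() * batch_size
            correct += (outputs_val.argmax(dim=1) == labels_val).sum().item()
            count += batch_size
    return total_loss / count, correct / count
```

Note that this sketch leaves the network in eval mode, so the training loop has to call `net.train()` again before the next epoch.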
Update: I put a breakpoint in the validation function, at the line
outputs_val = net(inputs_val)
its output is
tensor([[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],...
Then I tried
net(torch.zeros_like(inputs_val))
it gives
tensor([[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],
[ 1.1345, -1.1818, 0.0358],...
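The check above can be sketched as a self-contained script. The helper names `outputs_depend_on_input` and `batchnorm_stats` and the toy network are placeholders, not code from the notebook. The second helper rests on an unconfirmed assumption: since mobilenet and resnet18 contain BatchNorm layers and alexnet does not, the BatchNorm running statistics seem worth inspecting when eval-mode outputs ignore the input.

```python
import torch
import torch.nn as nn


def outputs_depend_on_input(net: nn.Module, inputs: torch.Tensor, atol: float = 1e-6) -> bool:
    """Return True if net(inputs) differs from net(zeros) in eval mode."""
    was_training = net.training
    net.eval()
    with torch.no_grad():
        out_real = net(inputs)
        out_zero = net(torch.zeros_like(inputs))
    net.train(was_training)  # restore the mode the network was in
    return not torch.allclose(out_real, out_zero, atol=atol)


def batchnorm_stats(net: nn.Module):
    """List (layer name, max |running_mean|, max running_var) per BatchNorm layer."""
    bn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    stats = []
    for name, module in net.named_modules():
        # running_mean is None when track_running_stats=False
        if isinstance(module, bn_types) and module.running_mean is not None:
            stats.append((
                name,
                module.running_mean.abs().max().item(),
                module.running_var.max().item(),
            ))
    return stats


if __name__ == "__main__":
    torch.manual_seed(0)
    # Tiny stand-in network (hypothetical), not the mobilenetv2 from the notebook.
    toy_net = nn.Sequential(
        nn.Conv2d(1, 4, kernel_size=3, padding=1),
        nn.BatchNorm2d(4),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(4, 3),
    )
    fake_batch = torch.rand(8, 1, 32, 32)
    print(outputs_depend_on_input(toy_net, fake_batch))
    for row in batchnorm_stats(toy_net):
        print(row)
```

On the real model, a `False` from the first helper reproduces what the breakpoint shows, and unusually large or non-finite values from the second would point at the normalization layers rather than at the validation loop.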
I’m about to cry…