Dear friends,
I am trying to adapt the PyTorch ImageNet example to train on Caltech101, another important dataset.
I am using ImageFolder just as the ImageNet example does, but the VGG model from torchvision.models is not learning (the loss is not decreasing):
EPOCH: 4
Learning Rate: 0.1000
Namespace(batch_size=16, dataset='caltech101', epochs=5, evaluate=False, learning_rate_decay_epochs=(6, 8, 9), learning_rate_decay_period=25, learning_rate_decay_rate=0.2, model='vgg19_bn', momentum=0.9, original_learning_rate=0.1, print_freq=16, seed=1, weight_decay=0.0005, workers=1)
Epoch: [4][16/190] Time 0.291 (0.285) Data 0.000 (0.004) Loss 4.6316 (4.6236) Prec@1 0.000 (0.391) Prec@5 0.000 (3.125)
Epoch: [4][32/190] Time 0.285 (0.287) Data 0.000 (0.002) Loss 4.6337 (4.6283) Prec@1 0.000 (0.586) Prec@5 0.000 (3.516)
Epoch: [4][48/190] Time 0.283 (0.286) Data 0.000 (0.001) Loss 4.6657 (4.6294) Prec@1 0.000 (0.781) Prec@5 0.000 (3.385)
Epoch: [4][64/190] Time 0.283 (0.286) Data 0.000 (0.001) Loss 4.6256 (4.6309) Prec@1 0.000 (0.684) Prec@5 0.000 (3.320)
Epoch: [4][80/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6329 (4.6314) Prec@1 0.000 (0.625) Prec@5 0.000 (3.281)
Epoch: [4][96/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6345 (4.6329) Prec@1 0.000 (0.521) Prec@5 12.500 (3.125)
Epoch: [4][112/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6323 (4.6343) Prec@1 0.000 (0.558) Prec@5 0.000 (2.958)
Epoch: [4][128/190] Time 0.283 (0.284) Data 0.000 (0.001) Loss 4.6604 (4.6355) Prec@1 0.000 (0.488) Prec@5 6.250 (3.027)
Epoch: [4][144/190] Time 0.285 (0.284) Data 0.000 (0.001) Loss 4.7083 (4.6372) Prec@1 0.000 (0.434) Prec@5 0.000 (2.908)
Epoch: [4][160/190] Time 0.287 (0.285) Data 0.000 (0.001) Loss 4.6679 (4.6370) Prec@1 6.250 (0.508) Prec@5 6.250 (3.086)
Epoch: [4][176/190] Time 0.287 (0.285) Data 0.000 (0.000) Loss 4.6311 (4.6385) Prec@1 6.250 (0.497) Prec@5 6.250 (2.947)
TRAIN@1: 0.528
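As a sanity check on the log above, here is a minimal sketch of what the cross-entropy loss would be for pure guessing. It assumes the dataset folder has 102 class subdirectories (the 101 categories plus a background folder); that count is my assumption and is not printed in the log. The helper name chance_level_loss is made up for illustration.

```python
import math


def chance_level_loss(num_classes: int) -> float:
    """Cross-entropy of a uniform prediction over num_classes classes."""
    return -math.log(1.0 / num_classes)


# Assumption: 102 class folders (101 categories plus a background folder).
num_classes = 102

loss = chance_level_loss(num_classes)
top1 = 100.0 / num_classes  # expected Prec@1 (%) for uniform guessing on balanced classes
top5 = 500.0 / num_classes  # expected Prec@5 (%) for uniform guessing on balanced classes

print(f"chance-level loss: {loss:.4f}")
print(f"chance-level Prec@1: {top1:.3f}%  Prec@5: {top5:.3f}%")
```

Under that assumption the chance-level loss is about 4.625, which is very close to the 4.63 reported throughout epoch 4, so the network seems to be producing near-uniform predictions rather than learning slowly.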
The images are JPG, and below is my code:
train_transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])
val_transform = transforms.Compose([
    transforms.Scale(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])
Question: Do I need to explicitly convert the JPG files to BMP in some way?
Or should this be unnecessary, since we are using the PIL library anyway?
I guess ImageNet images are provided in BMP format, right? Would that make any difference?
Thanks in advance,
David