Training on Caltech101: Not Learning

Dear friends,

I am trying to adapt the PyTorch ImageNet example to train on Caltech101, another widely used dataset.

I am using ImageFolder just as it is used in the ImageNet example, but the VGG model from torchvision.models is not learning (the loss is not decreasing):

EPOCH: 4
Learning Rate: 0.1000
Namespace(batch_size=16, dataset='caltech101', epochs=5, evaluate=False, learning_rate_decay_epochs=(6, 8, 9), learning_rate_decay_period=25, learning_rate_decay_rate=0.2, model='vgg19_bn', momentum=0.9, original_learning_rate=0.1, print_freq=16, seed=1, weight_decay=0.0005, workers=1)
Epoch: [4][16/190] Time 0.291 (0.285) Data 0.000 (0.004) Loss 4.6316 (4.6236) Prec@1 0.000 (0.391) Prec@5 0.000 (3.125)
Epoch: [4][32/190] Time 0.285 (0.287) Data 0.000 (0.002) Loss 4.6337 (4.6283) Prec@1 0.000 (0.586) Prec@5 0.000 (3.516)
Epoch: [4][48/190] Time 0.283 (0.286) Data 0.000 (0.001) Loss 4.6657 (4.6294) Prec@1 0.000 (0.781) Prec@5 0.000 (3.385)
Epoch: [4][64/190] Time 0.283 (0.286) Data 0.000 (0.001) Loss 4.6256 (4.6309) Prec@1 0.000 (0.684) Prec@5 0.000 (3.320)
Epoch: [4][80/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6329 (4.6314) Prec@1 0.000 (0.625) Prec@5 0.000 (3.281)
Epoch: [4][96/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6345 (4.6329) Prec@1 0.000 (0.521) Prec@5 12.500 (3.125)
Epoch: [4][112/190] Time 0.283 (0.285) Data 0.000 (0.001) Loss 4.6323 (4.6343) Prec@1 0.000 (0.558) Prec@5 0.000 (2.958)
Epoch: [4][128/190] Time 0.283 (0.284) Data 0.000 (0.001) Loss 4.6604 (4.6355) Prec@1 0.000 (0.488) Prec@5 6.250 (3.027)
Epoch: [4][144/190] Time 0.285 (0.284) Data 0.000 (0.001) Loss 4.7083 (4.6372) Prec@1 0.000 (0.434) Prec@5 0.000 (2.908)
Epoch: [4][160/190] Time 0.287 (0.285) Data 0.000 (0.001) Loss 4.6679 (4.6370) Prec@1 6.250 (0.508) Prec@5 6.250 (3.086)
Epoch: [4][176/190] Time 0.287 (0.285) Data 0.000 (0.000) Loss 4.6311 (4.6385) Prec@1 6.250 (0.497) Prec@5 6.250 (2.947)
TRAIN@1: 0.528

The images are JPGs, and below is my code:

    import torchvision.transforms as transforms

    # Per-channel mean/std taken from the ImageNet example
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

    train_transform = transforms.Compose([
        transforms.RandomSizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ])
    val_transform = transforms.Compose([
        transforms.Scale(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize,
    ])
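
For reference, the datasets and loaders are built the same way as in the ImageNet example, using the transforms defined above; the data/caltech101 paths below are placeholders for wherever the train/val split actually lives on disk:

    import torch
    import torchvision.datasets as datasets

    # Placeholder paths -- each directory must contain one subfolder per class,
    # which is the layout ImageFolder expects.
    train_dataset = datasets.ImageFolder('data/caltech101/train', train_transform)
    val_dataset = datasets.ImageFolder('data/caltech101/val', val_transform)

    # batch_size=16 and workers=1 match the Namespace printed in the log above.
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=16, shuffle=True,
        num_workers=1, pin_memory=True)
    val_loader = torch.utils.data.DataLoader(
        val_dataset, batch_size=16, shuffle=False,
        num_workers=1, pin_memory=True)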

Question: Do I need to explicitly convert the JPGs to BMP in some way?

Or should this not be necessary, since we are using the PIL library anyway?

I guess the ImageNet images are provided in BMP format, right? Would that make any difference here?

Thanks in advance,

David

No, you do not need to convert JPG to BMP.
ImageNet images are provided in JPEG format.
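
ImageFolder's default loader opens images with PIL and converts them to RGB, so JPEG, PNG, and BMP are all handled the same way. A quick sanity check you can run (the path is just a placeholder for any one of your Caltech101 images):

    from PIL import Image

    # Placeholder path -- point it at any image in your Caltech101 folder.
    img = Image.open('data/caltech101/train/accordion/image_0001.jpg').convert('RGB')
    print(img.size, img.mode)  # prints something like (300, 200) RGB

If that prints a size and 'RGB', the file format is not the problem.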

Were you able to get this working in the end? I am trying to do the same. Thanks.