Training ImageNet - VGG16

abhyantrika · December 25, 2017, 2:50pm

I am trying to train (a kind of fine-tuning) the VGG16 network.
I am using a custom set of CNN filters and am trying to retrain only the final dense classification layer. I am using the default settings provided in

github.com

pytorch/examples/blob/master/imagenet/main.py

import argparse
import os
import shutil
import time

import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models

model_names = sorted(name for name in models.__dict__
    if name.islower() and not name.startswith("__")
    and callable(models.__dict__[name]))


I have modified the model creation at lines 73-88 to import the model with my CNN filters. I have also made sure that requires_grad=True is set only for the parameters of the dense layer.
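For reference, the freezing step described above could look roughly like the sketch below. It is not the poster's actual code: `TinyVGG` is a hypothetical stand-in that only mimics the `features`/`classifier` split of torchvision's VGG16, and the optimizer hyperparameters are illustrative values. The idea is to disable gradients everywhere, re-enable them for the classifier, and hand only the trainable parameters to the optimizer.

```python
import torch
import torch.nn as nn


class TinyVGG(nn.Module):
    """Hypothetical stand-in with a VGG-like features/classifier split."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 4, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(4, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))


def freeze_all_but_classifier(model):
    """Freeze every parameter, then unfreeze the classifier only."""
    for p in model.parameters():
        p.requires_grad = False
    for p in model.classifier.parameters():
        p.requires_grad = True
    # Return only the parameters that should be optimized.
    return [p for p in model.parameters() if p.requires_grad]


model = TinyVGG()
trainable = freeze_all_but_classifier(model)

# Illustrative hyperparameters; pass only the trainable parameters.
optimizer = torch.optim.SGD(trainable, lr=0.1, momentum=0.9, weight_decay=1e-4)
```

With a real VGG16 the classifier contains several linear layers, so unfreezing only the last one would mean selecting that layer instead of the whole `classifier` module.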

This is my result after around 22 epochs:
Epoch: [22][4510/5005] Time 1.438 (2.538) Data 0.001 (1.757) Loss 6.9070 (6.9070) Prec@1 0.000 (0.103) Prec@5 0.391 (0.520)

As you can see, Prec@1 for the current batch is zero, and the running average is only 0.103!
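One way to read the log line above is to compare it with what a 1000-class classifier would score if it guessed uniformly at random. The sketch below computes those chance-level values; it assumes the loss is the usual cross-entropy and that the precision values are percentages.

```python
import math

num_classes = 1000

# Cross-entropy of a uniform prediction over all classes.
chance_loss = math.log(num_classes)

# Expected top-1 and top-5 accuracy (in percent) for random guessing.
chance_top1 = 100.0 * 1 / num_classes
chance_top5 = 100.0 * 5 / num_classes

print(f"chance loss : {chance_loss:.4f}")
print(f"chance top-1: {chance_top1:.3f}%")
print(f"chance top-5: {chance_top5:.3f}%")
```

The logged averages (loss 6.9070, Prec@1 0.103, Prec@5 0.520) are very close to these values, which suggests the network is producing essentially uninformative outputs rather than slightly wrong ones.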

I suspect there is something wrong with the data loading and labeling of the ImageNet dataset.
I downloaded and extracted the ILSVRC2012 archive to obtain train, test and eval.

The folder names inside the train/ directory correspond to some kind of synset ID (“n02084071”).
I am assuming the PyTorch loaders use these as the class labels.
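As a point of reference, torchvision's `ImageFolder` does not use the folder name itself as the label. It sorts the subfolder names and uses each name's position in that sorted list as the integer class index. The sketch below reproduces that rule with the standard library only; `folder_class_to_idx` is a hypothetical helper name, and the three folder names are just examples.

```python
import os
import tempfile


def folder_class_to_idx(root):
    """Map each subfolder name to its index in sorted order."""
    classes = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return {name: i for i, name in enumerate(classes)}


# Build a throwaway directory tree with a few example synset folders.
with tempfile.TemporaryDirectory() as root:
    for name in ["n02084071", "n01440764", "n01443537"]:
        os.makedirs(os.path.join(root, name))
    train_map = folder_class_to_idx(root)

print(train_map)
```

The practical consequence is that train and validation directories only agree on labels if they contain identically named subfolders.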

The eval/ folder does not come with the same set of folders; instead it comes with a .txt file that maps the images to numerical classes (1-1000). So I wrote a script that creates a folder for each numerical class label and places each image in the matching folder, using the data/metadata.mat file (this file has all the info and minute details).
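An alternative layout is to name the validation folders after the same synset IDs as the train/ folders rather than after the numerical classes. The sketch below shows one way to do that. It assumes the numerical labels have already been read into a list ordered like the sorted image filenames, and that a numeric-ID-to-synset mapping has been extracted from the metadata file; `sort_val_into_wnid_folders`, the file names, and the mapping values are all made up for illustration.

```python
import os
import shutil
import tempfile


def sort_val_into_wnid_folders(val_dir, labels, id_to_wnid):
    """Move each image into a subfolder named after its synset ID.

    labels[i] is the numeric class of the i-th image in sorted filename order.
    """
    images = sorted(
        f for f in os.listdir(val_dir)
        if os.path.isfile(os.path.join(val_dir, f))
    )
    if len(images) != len(labels):
        raise ValueError("number of images and labels differ")
    for fname, class_id in zip(images, labels):
        target = os.path.join(val_dir, id_to_wnid[class_id])
        os.makedirs(target, exist_ok=True)
        shutil.move(os.path.join(val_dir, fname), os.path.join(target, fname))


# Tiny demonstration with made-up files, labels and mapping.
with tempfile.TemporaryDirectory() as val_dir:
    for fname in ["val_0001.JPEG", "val_0002.JPEG", "val_0003.JPEG"]:
        open(os.path.join(val_dir, fname), "w").close()
    sort_val_into_wnid_folders(
        val_dir,
        labels=[2, 1, 2],
        id_to_wnid={1: "n01440764", 2: "n01443537"},
    )
    layout = {
        d: sorted(os.listdir(os.path.join(val_dir, d)))
        for d in sorted(os.listdir(val_dir))
    }

print(layout)
```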

Why is my precision zero even after 22 epochs?
Please let me know if my data processing is wrong, and feel free to point out any other errors.
Please help!

Giocobon · March 16, 2020, 2:44pm

Hi @abhyantrika. Have you been able to solve the issue? I have the same problem. My guess is that the pretrained model uses the labels in a different order than the one in your train/val datasets.
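A label-order mismatch of this kind can be checked without loading any images. The sketch below assumes that class indices come from sorting folder names as strings (as torchvision's `ImageFolder` does), that the validation folders are named with numerical classes, and that a numeric-ID-to-synset mapping is available. `find_label_mismatches` and the mapping values are hypothetical.

```python
def find_label_mismatches(val_folders, id_to_wnid, train_folders):
    """List validation folders whose index differs from the train index.

    Returns tuples of (val_folder, val_index, expected_train_index).
    """
    train_idx = {name: i for i, name in enumerate(sorted(train_folders))}
    mismatches = []
    # Folder names are sorted as strings, so "10" comes before "2".
    for val_index, folder in enumerate(sorted(val_folders)):
        wnid = id_to_wnid[int(folder)]
        expected = train_idx[wnid]
        if expected != val_index:
            mismatches.append((folder, val_index, expected))
    return mismatches


# Made-up example: three classes with numeric validation folder names.
train_folders = ["n01440764", "n01443537", "n01484850"]
id_to_wnid = {1: "n01484850", 2: "n01440764", 10: "n01443537"}
val_folders = ["1", "2", "10"]

mismatches = find_label_mismatches(val_folders, id_to_wnid, train_folders)
print(mismatches)
```

An empty result would mean both directories assign the same index to every class. Any entry in the list means validation accuracy is being measured against shifted labels.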