Finetuning Torchvision Models: how do I predict?

In the finetuning example, we’re only shown how to fine-tune the model with training and validation data. How do we test on new data?

How do I run something along the lines of

model.predict('testdata.jpg') that will return the predicted class?

If you have a folder containing the test data, the easiest approach would be to create a new Dataset and transformation, using the validation implementations as a template.
However, if you just want to classify a single sample, you can load the image, preprocess it, and pass it to your model. Here is some pseudo-code:

from PIL import Image

img = Image.open(PATH)
x_test = data_transforms['val'](img)
x_test.unsqueeze_(0)  # Add a batch dimension: [C, H, W] -> [1, C, H, W]

model.eval()  # Disable dropout and use the running batch-norm statistics
with torch.no_grad():  # No gradients needed for inference
    output = model(x_test)
pred = torch.argmax(output, 1)
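
To map the predicted index back to a human-readable label, you could index into the dataset’s class list. A minimal sketch, assuming the image_datasets dict from the tutorial:

class_names = image_datasets['val'].classes  # Subfolder names, sorted alphabetically
print(class_names[pred.item()])  # The predicted class name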

I encountered an error, and after a little googling (in case anyone else runs into this problem), I changed the code as follows, because the model is on the GPU :smiley:
Thank you! My single-image prediction worked!

x_test = data_transforms['val'](img).to(device)  # Move the input to the same device as the model

Now when I want to predict an entire folder, what’s the pseudocode?

Basically you would have to add a test set in the Load Data section of the tutorial, e.g.:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
...
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val', 'test']}

Just make sure you have a test folder in the corresponding data folder.

Also, add another phase in the training loop:

for phase in ['train', 'val', 'test']:
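
Inside that loop, the tutorial switches between training and evaluation behavior depending on the phase. A minimal sketch of how the test phase fits in, following the tutorial’s structure:

for phase in ['train', 'val', 'test']:
    if phase == 'train':
        model.train()  # Enable dropout and batch-norm updates
    else:
        model.eval()   # Evaluation mode for 'val' and 'test'

    for inputs, labels in dataloaders[phase]:
        inputs = inputs.to(device)
        labels = labels.to(device)

        # Track gradients only during training
        with torch.set_grad_enabled(phase == 'train'):
            outputs = model(inputs)
            ...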

thanks again!!! I did it.

Now I have more questions though, I hope you don’t mind.

Is there a way to output a few test images and show the predictions? Maybe also show the model’s predicted percentage for each class?

How can I showcase the predictions that are the most right and the most wrong?

You could adapt this code to visualize some test images.
To calculate the per-class accuracy, you could use some code from this tutorial.
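
A minimal sketch of a per-class accuracy computation, assuming class_names, dataloaders, and device from the tutorial; F.softmax additionally turns the logits into the per-class percentages you asked about:

import torch.nn.functional as F

class_correct = [0] * len(class_names)
class_total = [0] * len(class_names)

model.eval()
with torch.no_grad():
    for inputs, labels in dataloaders['test']:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        probs = F.softmax(outputs, 1)  # Per-class probabilities for each sample
        preds = torch.argmax(outputs, 1)
        for label, pred in zip(labels, preds):
            class_correct[label.item()] += int(pred == label)
            class_total[label.item()] += 1

for i, name in enumerate(class_names):
    print('{}: {:.1f}%'.format(name, 100. * class_correct[i] / max(class_total[i], 1)))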

What do you mean by “predictions that are the most right and the most wrong”?
Would you like to get the test image with the highest and lowest score for a specific class?


I would like to have the image that the model predicts with high confidence and gets right, and the image that the model predicts with high confidence but gets wrong.

edit:

What is the equivalent of this code here,

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

if I use the ImageFolder class instead? This is my modification of it:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(input_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(input_size),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize(input_size),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}



# Create training, validation, and test datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val', 'test']}
# Create training, validation, and test dataloaders
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val', 'test']}

To get the highest prediction for a wrong class, you could try to adapt this code and add it to train_model:

max_wrong_logit = -1000.
max_wrong_class = -1
max_wrong_index = -1

for batch_idx, (inputs, labels) in enumerate(dataloaders[phase]):
    # statistics
    running_loss += loss.item() * inputs.size(0)
    running_corrects += torch.sum(preds == labels.data)
    ...

    if phase == 'val':
        wrong_idx = preds != labels
        if wrong_idx.any():  # Skip batches where every prediction was correct
            wrong_logit = torch.max(outputs[wrong_idx])
            if max_wrong_logit < wrong_logit:
                # Locate the sample (row) and predicted class (column) of this logit
                idx = (outputs == wrong_logit).nonzero()
                image_idx = batch_idx * dataloaders[phase].batch_size + idx[0, 0].item()
                pred_class = class_names[idx[0, 1].item()]

                max_wrong_logit = wrong_logit.item()
                max_wrong_class = pred_class
                max_wrong_index = image_idx

This will store the image index along with the logit and class name.
The same should also work for the best prediction if you just change the index calculation to preds == labels.
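
Note that recovering image_idx from batch_idx only works if the DataLoader yields the samples in a fixed order, so the 'val' (and 'test') loaders should be created with shuffle=False for this to be meaningful.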

To get a validation set using CIFAR10, you could use a SubsetRandomSampler. @kevinzakka created a nice example here.
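
A minimal sketch of that approach, assuming trainset is the CIFAR10 training set from above and a 10% validation split:

import numpy as np
from torch.utils.data import DataLoader, SubsetRandomSampler

num_train = len(trainset)
indices = np.random.permutation(num_train)
split = int(0.1 * num_train)  # Hold out 10% of the training set for validation

train_sampler = SubsetRandomSampler(indices[split:])
val_sampler = SubsetRandomSampler(indices[:split])

train_loader = DataLoader(trainset, batch_size=4, sampler=train_sampler, num_workers=2)
val_loader = DataLoader(trainset, batch_size=4, sampler=val_sampler, num_workers=2)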


thank you time and time again!

You just singlehandedly saved my recognition system class :smiley:

I got another problem though. This is a different dataset than in my previous question:

when I try to classify a dataset that has more than 6 classes (this one has 70), I get this error:

RuntimeError: CUDA error: device-side assert triggered

The dataset is small, only about 1.7 GB, and my GPU (a GTX 1060) has 6 GB of VRAM, so I think it’s not a problem with my GPU, right? What’s wrong?

Could you try to run your model on the CPU first and see if you get any errors?
Alternatively, run your script with CUDA_LAUNCH_BLOCKING=1 python script.py args and post the stack trace here. CUDA operations are executed asynchronously, so without this flag the stack trace might point to the wrong line of code.

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /opt/conda/conda-bld/pytorch-nightly_1542967201177/work/aten/src/THNN/generic/ClassNLLCriterion.c:93

This is what I get if I set my device to "cpu".

The error points to the target which seems to have invalid indices.
Make sure the target only contains long values in the range [0, number_of_classes - 1].


How can I make sure of that?

Why is it that when I have 6 classes and set the number of classes to 6, it works fine,

but when I have 70 classes and set the number of classes to 70, it doesn’t?

I actually got it working by adding +1 to the number of classes, but this doesn’t seem right, because the accuracy is so low.

It should work with any number of classes, as long as the target indices match your model output.
Note that the indices start from 0 and go to (number_of_classes - 1).
If you have 70 classes, your target should contain values in [0, 69].
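
A quick sanity check you could drop into your training loop (assuming labels holds the current target batch and num_classes is defined):

assert labels.dtype == torch.long, 'targets must be long (int64)'
assert labels.min() >= 0 and labels.max() < num_classes, \
    'found targets in [{}, {}]'.format(labels.min().item(), labels.max().item())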


So the number of classes I should fill in is 69?

I’m not sure what you mean.
Your target should contain long values with class indices between 0 and 69:

target = torch.tensor([0, 55, 32, 69, 10, 3, 4])  # Sample target for a batch size of 7

invalid_target = torch.tensor([-1, 70, 71, 110])  # Invalid, since indices are not in range [0, 69]
# Top level data directory. Here we assume the format of the directory conforms
#   to the ImageFolder structure
data_dir = "./plane"

# Models to choose from [resnet, alexnet, vgg, squeezenet, densenet, inception]
model_name = "resnet"

# Number of classes in the dataset
num_classes = 71

# Batch size for training (change depending on how much memory you have)
batch_size = 10

# Number of epochs to train for 
num_epochs = 3

# Flag for feature extracting. When False, we finetune the whole model, 
#   when True we only update the reshaped layer params
feature_extract = True

# Detect if we have a GPU available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# device = "cpu"

What I mean is num_classes in this code: if I put 70 or 69 there, it gives me that error, but if I put 71, it somehow works. I’m not sure the model is classifying correctly though, because the accuracy is low.

On the other hand, I have no idea how I can tell whether my target contains indices between 0 and 69.

Where are you using num_classes further in the code?

It depends on how the target is created. If you use ImageFolder, the target will be created automatically from all the subfolders in the root you provided. However, you could also easily define the target yourself.
I’m currently not sure what kind of code you are using.
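
For illustration, here is a minimal sketch of defining the targets yourself with a custom Dataset; image_paths and targets are hypothetical variables holding your file paths and class indices:

from PIL import Image
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, image_paths, targets, transform=None):
        self.image_paths = image_paths
        self.targets = targets  # e.g. a list of ints in [0, num_classes - 1]
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        img = Image.open(self.image_paths[index])
        if self.transform is not None:
            img = self.transform(img)
        return img, self.targets[index]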


I’m using the Finetuning Torchvision Models tutorial as a template. I hope this Google Colab notebook can be of help:
https://colab.research.google.com/drive/175uIbG6u6DNdMmbbmfI8KW6R-HBm7hGf

The folder structure is:

plane
  train
    plane1
    ...
    plane70
  val
    plane1
    ...
    plane70

Thanks for the code!
It looks fine, since the number of classes should be 70. At least that’s what the output says.

Could you set num_classes=70 again and print the target tensor in your training loop?
If you run the script again, we can see what’s in the current target before the error occurs.
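
For example (assuming the variable names from the notebook):

for inputs, labels in dataloaders_dict['train']:
    print(labels)  # Class indices; every value should be in [0, 69]
    print(labels.min().item(), labels.max().item())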
