Output values of a CNN and transfer learning

Korsarz · April 26, 2020, 11:17am

Hi,
I built a CNN based on the tutorial "How We Can Give Our Computers Eyes and Ears", because it solves a similar problem: in my program I try to recognize people from a picture of their ear taken with a camera (I use OpenCV).
I have a first demo version and a few questions about it:

- The tutorial uses this transform:

transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

If I use a pretrained ResNet, do I have to normalize each frame with these values instead?

mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]

- This is my prediction function. The output (the score variable) takes values roughly between -6 and 6 (I'm not sure of the exact range).

def argmax(prediction):
    prediction = prediction.cpu()
    prediction = prediction.detach().numpy()
    top_1 = np.argmax(prediction, axis=1)
    score = np.amax(prediction)
    score = '{:6f}'.format(score)
    prediction = top_1[0]
    result = dog_breeds[prediction]

    return result,score

I expected a value between 0 and 1, because in many examples people use if (score > 0.5) to decide whether the prediction is good.

- The tutorial's author uses transfer learning. After that he doesn't seem to use his own dataset, only the data from ResNet. So if I use a pretrained network, do I not need my own dataset? I can't find the place in the program where he combines his dataset with the ResNet data, and I don't understand how this works.

Thanks for any answers!
Have a nice day

ptrblck · April 26, 2020, 11:43pm

If you are using a pretrained model, I would recommend using the same preprocessing that was used to train this model.
In your case, you should use the second normalization approach (the mean and std starting with 0.485 and 0.229).
You could also change the preprocessing, but you would then probably need to fine-tune the complete model.
As always, it depends on the use case and you would have to try out different approaches.
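
For frames coming from OpenCV, the normalization could be sketched roughly like this. It is only an illustration under a few assumptions: the frame is a uint8 array in BGR channel order with shape height x width x 3, the mean and std values are the ones quoted in the question, and preprocess_frame is a made-up helper name, not something from the tutorial. Resizing to the input size the model expects is left out.

```python
import numpy as np

# Per-channel statistics in RGB order, as quoted in the question.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_frame(frame_bgr):
    """Hypothetical helper: BGR uint8 frame (H x W x 3) -> normalized float32 array (3 x H x W)."""
    rgb = frame_bgr[..., ::-1]                 # reverse the channel axis: BGR -> RGB
    scaled = rgb.astype(np.float32) / 255.0    # [0, 255] -> [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD  # broadcast over the last (channel) axis
    # Move channels first (H x W x C -> C x H x W), the layout PyTorch models expect.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))


# Stand-in for a camera frame: a fully blue image in BGR order.
frame = np.zeros((4, 5, 3), dtype=np.uint8)
frame[..., 0] = 255
tensor_like = preprocess_frame(frame)
print(tensor_like.shape)
```

The result could then be wrapped with torch.from_numpy and given a batch dimension before it is passed to the model.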

It seems you are working on a multi-class classification problem (probably with nn.CrossEntropyLoss as the criterion). If that's the case, then your model will output class logits, which are not limited to the range [0, 1]. To get the predicted class, the argmax usage is correct.
If you want to see the probabilities, e.g. for debugging, you could apply softmax to your model output.
Note that you should not pass softmax(output) to nn.CrossEntropyLoss, as F.log_softmax will be applied internally!
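
As a rough sketch of that idea, a prediction function could return a probability instead of the raw logit. The name predict_with_confidence, the class_names list and the example logits are made up for illustration; the output is assumed to have shape [batch_size, num_classes].

```python
import torch


def predict_with_confidence(logits, class_names):
    """Hypothetical helper: return (label, probability) for the first sample in the batch."""
    # Softmax over the class dimension turns logits into probabilities that sum to 1.
    probs = torch.softmax(logits.detach().cpu(), dim=1)
    confidence, index = probs.max(dim=1)
    return class_names[index[0].item()], confidence[0].item()


# Made-up logits in the range reported in the question (about -6 to 6).
example_logits = torch.tensor([[-6.0, 0.0, 6.0]])
label, confidence = predict_with_confidence(example_logits, ["a", "b", "c"])
print(label, confidence)
```

The predicted class is the same as with argmax on the logits, since softmax does not change their order. Use this only for inference and keep passing the raw logits to the criterion during training.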

I see a custom Dataset definition in the tutorial and it’s being used in the training loop via:

for i, (inputs, labels) in enumerate(dataloaders[phase]):
    ...
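
To show how your own data enters such a loop, here is a minimal sketch with an in-memory Dataset. The class name EarDataset, the random stand-in tensors and the number of classes are assumptions for illustration only; the tutorial's Dataset will load real images instead.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class EarDataset(Dataset):
    """Hypothetical dataset holding (image tensor, integer label) pairs in memory."""

    def __init__(self, images, labels):
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]


# Random stand-in data: 8 RGB images of 224x224 pixels, 3 people (classes).
all_images = torch.randn(8, 3, 224, 224)
all_labels = torch.randint(0, 3, (8,))
dataloaders = {
    "train": DataLoader(EarDataset(all_images, all_labels), batch_size=4, shuffle=True)
}

phase = "train"
for i, (inputs, labels) in enumerate(dataloaders[phase]):
    # Each iteration yields one batch of your own images and labels.
    print(i, inputs.shape, labels.shape)
```

The pretrained weights only initialize the model. They come from the data the model was originally trained on, which knows nothing about your classes, so you still need your own labeled images to fine-tune it.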