I’d like to use the pre-trained ResNet in all of its glory, but I’m having a hard time finding the labels corresponding to each output. Could someone point me in the right direction?
Also, the outputs are all linear. I’m assuming the pre-trained model needs a softmax applied to the output?
First, the pre-trained ResNet was trained on the ImageNet database, which has 1000 categories.
So the output is a vector of 1000 floating-point values. Each value is the score for the image belonging to the category that matches its index. You don’t need to apply the softmax function to get a prediction: softmax preserves the ordering of the scores, so the index of the max score is the same with or without it. Apply it only if you want the scores expressed as probabilities. To predict, you just look for the index that holds the max score; it corresponds to the category predicted by the model.
Suppose we have 3 categories instead of 1000; the extension to 1000 is straightforward.
Categories are: cat=0, dog=1, puma=2. Here the output is a vector of 3 float values, one score per category. So an output of outp = [2.3, 7.22, -4.25]
means score(Image is cat) = 2.3, score(Image is dog) = 7.22 and score(Image is puma) = -4.25.
Index of the max score is 1 in outp. So the model predicts that the image is a dog.
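To make the toy example concrete, here is a minimal pure-Python sketch. The names labels, scores and softmax are illustrative only and not part of any library. It picks the index of the max score and also checks that softmax leaves that index unchanged.

```python
import math

# Toy setup from the example above: cat=0, dog=1, puma=2.
labels = ["cat", "dog", "puma"]
scores = [2.3, 7.22, -4.25]  # raw model outputs (one score per category)


def softmax(values):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Index of the largest raw score.
predicted_index = max(range(len(scores)), key=lambda i: scores[i])

# Softmax is monotonic, so the largest probability sits at the same index.
probs = softmax(scores)
prob_index = max(range(len(probs)), key=lambda i: probs[i])

print(predicted_index, labels[predicted_index])
print(prob_index == predicted_index)
```

With 1000 categories the only change is that labels and scores each have 1000 entries.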
What remains is the category name for each of the indexes 0, …, 999 of ImageNet. You can find them here: ImageNet index - class name
Hopefully it’s clear.
I forgot to mention how to get the label predictions from the output scores. It’s very easy. The output of this model is a tensor, and tensors have a max method. Called with a dimension, max returns both the max values and their indexes, so unpack the pair:
output = pretrained_resnet101(input)
max_scores, predictions = output.max(1)
For an unmodified torchvision ResNet, output has size [N x 1000], where N is the batch size.
We want the indexes of the max scores over the categories, which lie along dimension 1, so we take the max over dimension 1.
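Here is a small sketch of what max over dimension 1 returns. It uses a hand-made tensor of shape [2 x 3] as a stand-in for a real model output, so the values are invented for illustration and no pre-trained weights are needed.

```python
import torch

# Stand-in for a model output: a batch of N=2 images, 3 categories each.
# The values are invented for illustration.
output = torch.tensor([[2.3, 7.22, -4.25],
                       [0.1, -1.0, 3.5]])

# max over dimension 1 returns a pair: the max scores and their indexes.
max_scores, predictions = output.max(1)

# argmax returns only the indexes.
same_predictions = output.argmax(dim=1)

# softmax gives probabilities if you need them; the indexes stay the same.
probs = torch.softmax(output, dim=1)

print(predictions.tolist())
```

If you only need the predicted indexes, argmax is the shorter call; max is handy when you also want the winning score for each image.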
This is great. Thank you for such a thorough answer!