Predicting the top 5 probabilities fails with a size mismatch error

I am using a VGG16 model for an image classifier project. I have trained, validated, and tested the model successfully, and I have written a preprocessing function so an image can be used as input to the model; that works fine. But when I started working on class prediction, to predict the top 5 most probable classes, I am getting a runtime error: RuntimeError: size mismatch, m1: [1 x 49152], m2: [25088 x 4096]. I am not sure where I am going wrong. Please find all the details below.

Model Details
from collections import OrderedDict

import torch
from torch import nn
from torchvision import models

model = models.vgg16(pretrained=True)

# Freeze the pretrained feature weights
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head for 102 flower classes
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(25088, 4096)),
    ('relu', nn.ReLU()),
    ('drop', nn.Dropout(0.5)),
    ('fc2', nn.Linear(4096, 102)),
    ('output', nn.LogSoftmax(dim=1))
]))

model.classifier = classifier
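
As a sanity check on the 25088 figure (a minimal sketch, not part of the training code): for a 224x224 input, VGG16's convolutional features come out as 512 x 7 x 7 = 25088 values, which is exactly what fc1 expects.

with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)   # one fake 224x224 RGB image
    feats = model.features(dummy)
    print(feats.shape)                    # torch.Size([1, 512, 7, 7])
    print(feats.view(1, -1).shape)        # torch.Size([1, 25088]) -- matches fc1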

I have written the following function that preprocesses an image so it can be used as input for the model.
import numpy as np
from PIL import Image

def process_image(image_path):
    ''' Scales, crops, and normalizes a PIL image for a PyTorch model,
        returns a NumPy array.
    '''
    # Open the image
    img = Image.open(image_path)

    # Resize so the shorter side is 256, keeping the aspect ratio
    if img.size[0] > img.size[1]:
        img.thumbnail((img.size[0], 256))
    else:
        img.thumbnail((256, img.size[1]))

    # Crop out the center 224x224
    left = (img.width - 224) / 2
    top = (img.height - 224) / 2
    right = (img.width + 224) / 2
    bottom = (img.height + 224) / 2

    img.crop((left, top, right, bottom))

    # Normalize to the ImageNet mean and std
    img = np.array(img) / 255
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    img = (img - mean) / std

    # Move color channels to the first dimension as expected by PyTorch
    img = img.transpose((2, 0, 1))

    return img
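
A quick way to check the preprocessing (a sketch, using the image path from further below) is to print the shape of the returned array; it should be (3, 224, 224), and anything else points at the resize/crop step:

print(process_image("flowers/valid/1/image_06769.jpg").shape)  # expect (3, 224, 224)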

Class Prediction
def predict(image_path, model, topk=5):
    ''' Predict the class (or classes) of an image using a trained deep learning model.
    '''
    model.eval()
    image = process_image(image_path)
    if torch.cuda.is_available():
        image_tensor = torch.from_numpy(img).type(torch.cuda.FloatTensor)
    else:
        image_tensor = torch.from_numpy(img).type(torch.FloatTensor)
    #image = image.view(1, 296448)
    model_input = image_tensor.unsqueeze(0)
    with torch.no_grad():
        outputs = model(model_input)
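
The top-5 step itself (a sketch of what I plan to run on outputs, given the LogSoftmax head) would continue inside predict as:

    probs = torch.exp(outputs)                  # undo LogSoftmax to get probabilities
    top_p, top_class = probs.topk(topk, dim=1)  # 5 largest probabilities and their indices
    return top_p, top_class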

Calling the predict function

image_path = "flowers/valid/1/image_06769.jpg"
predict(image_path, model.to(device))

Please help.

Could you print the shape of model_input before passing it to the model?

Thanks a lot for your response. I have printed the size of model_input and it is torch.Size([1, 3, 256, 386])

That’s most likely the error.
The VGG architecture expects an input of shape [batch_size, 3, 224, 224], which should be taken care of in your process_image method.
However, it looks like you are passing the wrong array to torch.from_numpy:

# Change this
image_tensor = torch.from_numpy(img).type(torch.cuda.FloatTensor)
# to
image_tensor = torch.from_numpy(image).type(torch.cuda.FloatTensor)

and try to run it again.

Thanks a lot. You are right, the issue was in the process_image method. I have now resolved it and it is working. Thanks again!
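
For reference, the culprit was most likely the discarded crop: PIL's Image.crop returns a new image rather than cropping in place, so the 224x224 result was being thrown away and the full 256x386 array went into the model.

# Before: the cropped image is discarded
img.crop((left, top, right, bottom))
# After: keep the 224x224 center crop
img = img.crop((left, top, right, bottom))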