Torchvision pretrained vgg16 doesn't accept 224*224*3 image?

I’m trying to use the pretrained vgg16 model provided by torchvision.

But it seems that the expected input is not a 3x224x224 image, as I got the following error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 3 3, but got 3-dimensional input of size [3, 224, 224] instead

My code:

import torch
import torch.nn as nn
from torchvision import models

original_model = models.vgg16(pretrained=True)

ran = torch.rand((3, 224, 224))  # [channels, height, width]

original_model.features(ran)

The input is expected to have the shape [batch_size, channels, height, width], so you would have to add a batch dimension at dim0 via:

ran = ran.unsqueeze(0)

or initialize the batch dimension directly via:

ran = torch.rand((1, 3, 224, 224))
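
For completeness, here is a minimal sketch of your snippet with the batch dimension added (the output shape is what vgg16's feature extractor produces for a 224x224 input):

import torch
from torchvision import models

original_model = models.vgg16(pretrained=True)
original_model.eval()  # inference mode for the pretrained weights

ran = torch.rand((3, 224, 224))  # [channels, height, width]
ran = ran.unsqueeze(0)           # add batch dim -> [1, 3, 224, 224]

with torch.no_grad():
    out = original_model.features(ran)

print(out.shape)  # torch.Size([1, 512, 7, 7])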

Thank you, this works. However, why does the error message say 64 3 3 3 instead of 1, 3, 224, 224?

Oh, I got it. It's the 4-dimensional weight …

This is the kernel shape of the first conv layer (64 filters, 3 input channels, 3x3 spatial size).
The error message mentions the 3-dimensional input as:

but got 3-dimensional input of size [3, 224, 224] instead
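
If you want to see where those numbers come from, you can print the first module of features and its weight shape directly:

print(original_model.features[0])               # Conv2d(3, 64, kernel_size=(3, 3), ...)
print(original_model.features[0].weight.shape)  # torch.Size([64, 3, 3, 3])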

EDIT: I was a second too late :wink: