I’m trying to use the pretrained vgg16 model provided by torchvision.
But it seems that the expected input image is not 3x224x224, as I got the following error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 3 3, but got 3-dimensional input of size [3, 224, 224] instead
import torch
from torchvision import models

original_model = models.vgg16(pretrained=True)
ran = torch.rand((3, 224, 224))
out = original_model(ran)  # raises the RuntimeError above
The input is expected to have the shape
[batch_size, channels, height, width]
You would have to add a batch dimension in dim0 via:
ran = ran.unsqueeze(0)
or initialize the batch dimension directly via:
ran = torch.rand((1, 3, 224, 224))
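As a quick shape check (plain torch, no pretrained weights needed), both approaches produce the expected 4-dimensional input:

```python
import torch

# 3-dimensional tensor: [channels, height, width]
ran = torch.rand((3, 224, 224))

# Option 1: add a batch dimension at dim0
batched = ran.unsqueeze(0)
print(batched.shape)  # torch.Size([1, 3, 224, 224])

# Option 2: create the batch dimension directly
batched2 = torch.rand((1, 3, 224, 224))
print(batched2.shape)  # torch.Size([1, 3, 224, 224])
```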
Thank you, this works. However, why does the error message say 64 3 3 3 instead of [1, 3, 224, 224]?
oh, I got it. it’s the 4-dimensional weight …
This is the kernel shape of the conv layer (64 filters, 3 input channels, 3x3 spatial size).
The error message mentions the 3-dimensional input as:
but got 3-dimensional input of size [3, 224, 224] instead
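You can reproduce that weight shape on a standalone conv layer configured like VGG16’s first conv layer (64 filters, 3 input channels, 3x3 kernels); this is just an illustrative layer, not the pretrained model itself:

```python
import torch.nn as nn

# Same configuration as VGG16's first conv layer
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)

# Weight layout is [out_channels, in_channels, kernel_h, kernel_w]
print(conv.weight.shape)  # torch.Size([64, 3, 3, 3])
```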
EDIT: I was a second too late