Expected stride to be a single integer value or a list?

I am using ResNet18 to do some transfer learning for a project. I have been able to follow the Transfer Learning Tutorial without any real issue (I am using a feed-forward DNN for ‘fc’ with a final layer that classifies 102 categories instead of 1000).

However, after my model is trained, I of course want to perform inference on some test data (images). The problem is when I try to do something like this:

# img is a [3,224,224] Tensor just like all the other images during training
img = process_image(image_path)
model.eval()
with torch.no_grad():
    outputs = model(img)  # Note that wrapping this in a Variable() results in the same thing

It throws a somewhat cryptic exception:

RuntimeError: expected stride to be a single integer value or a list of 1 values to match the convolution dimensions, but got stride=[2, 2]

Does anyone know what I’m doing wrong or how I can perform inference using ResNet18?

From your code sample it looks like the batch dimension is missing.
Try to add img.unsqueeze_(0) to your code before passing the image to the model.

That worked! Wow! So can you explain to me the “unsqueeze_” part? Sorry, I’m new to PyTorch.

tensor.unsqueeze_() adds a dimension of size one at the specified position.
If you pass the argument dim=0 to the function, it will add a new dimension at position 0.
Here is a small example:

x = torch.randn(10, 10)
y = torch.unsqueeze(x, 0)
print(y.shape)
> torch.Size([1, 10, 10])

As you can see y now has an additional dimension.
You can also use the in-place method on x directly. In-place methods are marked by a trailing underscore in the method name:

x = torch.randn(10, 10)
x.unsqueeze_(0)
print(x.shape)
> torch.Size([1, 10, 10])
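
The difference between the two forms can be seen in a small sketch (the variable names are only illustrative): the out-of-place call returns a new tensor and leaves x alone, while every in-place call changes x itself, so calling it twice adds two dimensions.

```python
import torch

x = torch.randn(10, 10)

# Out-of-place: returns a new tensor, x keeps its shape.
y = x.unsqueeze(0)
print(x.shape)  # torch.Size([10, 10])
print(y.shape)  # torch.Size([1, 10, 10])

# In-place: x itself is modified on every call.
x.unsqueeze_(0)
print(x.shape)  # torch.Size([1, 10, 10])
x.unsqueeze_(0)
print(x.shape)  # torch.Size([1, 1, 10, 10])
```

This is worth keeping in mind when an in-place call sits inside a print statement and is then repeated when the tensor is passed to the model.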

In your example, you tried to feed a single image into your model.
The model (like nn.Modules in general) needs input with a batch dimension at position 0.
If you use a mini batch of images, it should work:

x = torch.randn(10, 3, 224, 224)
model = nn.Conv2d(3, 6, 3, 1, 1)
output = model(x)

However, if you just load a single image, you have to add the batch dimension, if it’s not already there.
Have a look at the following example:

x_single = torch.randn(3, 224, 224)
output = model(x_single) # RuntimeError

x_single.unsqueeze_(0)
print(x_single.shape)
output = model(x_single) # Works now!
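
As a rough sketch of some equivalent ways to get the batch dimension without modifying the original tensor (the tensor names here are made up for illustration): unsqueeze(0) and indexing with None both return a new 4-D tensor, and torch.stack builds a batch from several single images.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 6, 3, 1, 1)  # same toy model as above

img_a = torch.randn(3, 224, 224)
img_b = torch.randn(3, 224, 224)

batch_one = img_a.unsqueeze(0)           # [1, 3, 224, 224], img_a is unchanged
batch_none = img_a[None]                 # same result via indexing with None
batch_two = torch.stack([img_a, img_b])  # [2, 3, 224, 224]

with torch.no_grad():
    out_one = model(batch_one)
    out_two = model(batch_two)

print(out_one.shape)  # torch.Size([1, 6, 224, 224])
print(out_two.shape)  # torch.Size([2, 6, 224, 224])
```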

@ptrblck Thank you so much. I wish the exception thrown was a bit more reflective of “you need to add a batch size” or something like that. But I get it.

Yeah, you are right. The error is quite misleading.

The problem was that I was so focused on making sure my image was properly tensorized that I didn’t think to actually look at the API and recognize that it wants a batch dimension (it would also be nice if the API defaulted to a batch size of one for inference; I think forcing the unsqueeze call is a bit strange).

Anyway, thanks @ptrblck. I’m getting it, really… :)

Could you tell me which PyTorch version you are using?
You can see it with: print(torch.__version__).

The answer to your question (for forum word minimums):

0.4.0

Hi,
I used exactly that and printed the result:

Image type: torch.FloatTensor
shape without unsqueezing: torch.Size([3, 10, 10])
After unsqueezing: torch.Size([1, 3, 10, 10])

I was using VGG16 to detect a person's face in a JPEG image.
Before unsqueezing, it gave an error like this:

RuntimeError: expected stride to be a single integer value or a list of 1 values to match the convolution dimensions, but got stride=[1, 1]

After passing the unsqueezed image to the model as
vgg16(img.unsqueeze_(0))

it threw another error like this:

RuntimeError: Given input size: (512x1x1). Calculated output size: (512x0x0). Output size is too small at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THNN/generic/SpatialDilatedMaxPooling.c:67

My code snippet is as follows:

    img_to_tensor =  transforms.Compose([
        transforms.CenterCrop(10),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])
    img_pil = Image.open(img_path)
    img = img_to_tensor(img_pil)
#     print("Image type: ", img.type())
#     print("shape without unsqueezing: ", img.shape)
#     print("After unsqueezing: ",img.unsqueeze_(0).shape)
    prediction = VGG16(img.unsqueeze_(0))
    ## Return the *index* of the predicted class for that image
    

    return prediction # predicted class index

Please help!

@ptrblck, I have found a solution. Sorry if I mentioned the same error. :)
Just changing img.unsqueeze_(0) to img.unsqueeze(0) makes all the errors vanish.

That’s strange, as the last error points to an input whose spatial size is too small.
I.e. if you are using convs and pooling layers, which decrease your spatial size, you might end up with an empty tensor at some point, which will raise this error.
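
To make this concrete, here is a rough sketch in plain Python that follows only the spatial size through the five 2x2, stride-2 max-pooling layers of a VGG16-style feature extractor. It assumes the 3x3 convolutions use padding 1 and so keep the size unchanged; the helper names are made up for illustration.

```python
def pool_out(size, kernel=2, stride=2):
    """Output size of an unpadded max pool along one spatial dimension."""
    return (size - kernel) // stride + 1


def trace_pools(size, num_pools=5):
    """Spatial size before the first pool and after each pool."""
    sizes = [size]
    for _ in range(num_pools):
        if sizes[-1] <= 0:
            break  # the tensor is already empty, so stop here
        sizes.append(pool_out(sizes[-1]))
    return sizes


print(trace_pools(10))   # [10, 5, 2, 1, 0]
print(trace_pools(224))  # [224, 112, 56, 28, 14, 7]
```

A 10x10 crop is already down to 1x1 when it reaches the fourth pooling layer and would become 0x0 there, which matches the (512x1x1) to (512x0x0) message above, while a 224x224 crop ends at 7x7.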

But suddenly img.unsqueeze_(0) worked when I just tried that line again. Things are not making sense to me, probably because my neurons are not yet trained in learning PyTorch.

I was using the VGG16 model to predict dogs for an assignment. I just tried both and it all worked.

def VGG16_predict(img_path):
    '''
    Use pre-trained VGG-16 model to obtain index corresponding to
    predicted ImageNet class for image at specified path

    Args:
        img_path: path to an image

    Returns:
        Index corresponding to VGG-16 model's prediction
    '''

    ## TODO: Complete the function.
    ## Load and pre-process an image from the given img_path
    img_to_tensor = transforms.Compose([
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])
    img_pil = Image.open(img_path)
    img = img_to_tensor(img_pil)
    print("Image type: ", img.type())
    print("shape without unsqueezing: ", img.shape)
    # Out-of-place here, so the in-place call below adds the batch dimension only once
    print("After unsqueezing: ", img.unsqueeze(0).shape)
    prediction = VGG16(img.unsqueeze_(0))
    ## Return the *index* of the predicted class for that image

    return prediction.data.numpy().argmax() # predicted class index

Oh sorry, I found that the problem happened because I was previously using
transforms.CenterCrop(40); that's why it was giving the error.
When I changed it to transforms.CenterCrop(224), it all worked.

Thank you for helping :)

Hi
class gdl3d_loss(nn.Module):

    def __init__(self, pNorm=2):
        super(gdl3d_loss, self).__init__()
        self.convX = nn.Conv3d(1, 1, kernel_size=(3,3), stride=1, padding=1, bias=False)
        self.convY = nn.Conv3d(1, 1, kernel_size=(3,3), stride=1, padding=1, bias=False)
        self.convZ = nn.Conv3d(1, 1, kernel_size=(3,3), stride=1, padding=1, bias=False)

        filterX = torch.FloatTensor([[[[-1, 1]]]]) # 1x2

        filter_x = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])
        print(filter_x.shape)
        print(filter_x.size)
        self.convX.weight = nn.Parameter(torch.from_numpy(filter_x).float().unsqueeze(0).unsqueeze(0))

        filter_y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])
        self.convY.weight = nn.Parameter(torch.from_numpy(filter_y).float().unsqueeze(0).unsqueeze(0))

        filter_z = np.array([[-1, -2, 1], [0, 0, 0], [1, 2, 1]])
        self.convZ.weight = nn.Parameter(torch.from_numpy(filter_z).float().unsqueeze(0).unsqueeze(0))

        self.pNorm = pNorm

    def forward(self, pred, gt):
        assert not gt.requires_grad
        # dim() gives the number of dimensions; size() returns the full shape
        assert pred.dim() == 5
        assert gt.dim() == 5
        assert pred.size() == gt.size(), "{0} vs {1} ".format(pred.size(), gt.size())

        print(pred.shape)
        print(gt.shape)

        pred_dx = torch.abs(self.convX(pred))
        pred_dy = torch.abs(self.convY(pred))
        pred_dz = torch.abs(self.convZ(pred))

        gt_dx = torch.abs(self.convX(gt))
        gt_dy = torch.abs(self.convY(gt))
        gt_dz = torch.abs(self.convZ(gt))

        grad_diff_x = torch.abs(gt_dx - pred_dx)
        grad_diff_y = torch.abs(gt_dy - pred_dy)
        grad_diff_z = torch.abs(gt_dz - pred_dz)

        mat_loss_x = grad_diff_x ** self.pNorm
        mat_loss_y = grad_diff_y ** self.pNorm  # Batch x Channel x width x height
        mat_loss_z = grad_diff_z ** self.pNorm

        shape = gt.shape
        mean_loss = (torch.sum(mat_loss_x) + torch.sum(mat_loss_y) + torch.sum(mat_loss_z)) / (shape[0] * shape[1] * shape[2] * shape[3])

        return mean_loss

I have this class, but I get an error when I run gdl3d(outputG,ct_train_batch).
The tensor dimensions are (10,1,64,64,64), and I get the following error:
RuntimeError: expected stride to be a single integer value or a list of 2 values to match the convolution dimensions, but got stride=[1, 1, 1]

Could you please help me with that?

The error message is a bit misleading, but points to the wrong definition of your kernel size.
Pass the size as a tuple of 3 values kernel_size=(3, 3, 3) or as a single value kernel_size=3 and it should work.

Thank you so much. I fixed that, but I am still getting the same error. Do you think the conv.weight size might be wrong? I wrote this gradient loss function for medical imaging in order to get less blurry images (I am generating CT from MRI).
I would appreciate your help.

How did you fix it, if you are getting the same error? :)
This code shows the issue and the proposed solution:


x = torch.randn(1, 1, 24, 24, 24)
conv = nn.Conv3d(1, 1, kernel_size=(3,3), stride=1, padding=1, bias=False)
out = conv(x) # breaks
conv = nn.Conv3d(1, 1, kernel_size=(3, 3, 3), stride=1, padding=1, bias=False)
out = conv(x) # works

Hi, I mean yes, I changed it to (3,3,3) and I was still getting the same error. I guess it is due to the conv.weight dimensions.
Thanks
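
One thing that might be worth checking here, purely as an assumption since the thread never confirms the cause: nn.Conv3d stores its weight as a 5-D tensor of shape (out_channels, in_channels, kD, kH, kW), while a 3x3 NumPy filter unsqueezed twice is only 4-D. The sketch below compares the two shapes; repeating the 2-D filter along a new depth axis is just an illustrative way to get a 5-D weight, not necessarily the filter intended for the loss.

```python
import numpy as np
import torch
import torch.nn as nn

conv = nn.Conv3d(1, 1, kernel_size=3, stride=1, padding=1, bias=False)
print(conv.weight.shape)  # torch.Size([1, 1, 3, 3, 3])

filter_x = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])
w2d = torch.from_numpy(filter_x).float().unsqueeze(0).unsqueeze(0)
print(w2d.shape)  # torch.Size([1, 1, 3, 3]) -> 4-D, does not match Conv3d

# Illustration only: repeat the 2-D filter along a new depth axis to get 5-D.
w3d = w2d.unsqueeze(2).repeat(1, 1, 3, 1, 1)
conv.weight = nn.Parameter(w3d, requires_grad=False)

x = torch.randn(1, 1, 8, 8, 8)
out = conv(x)
print(out.shape)  # torch.Size([1, 1, 8, 8, 8])
```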

In that case, could you post a minimal, executable code snippet, which yields this error so that we could debug it, please?