Image convolution returns wrong output dimension

Hello, I’m new to PyTorch. I’m trying to convert code that uses functions from the SciPy and NumPy libraries to PyTorch, in order to build a NN and run it on the GPU.

I have some convolution layers that convolve an image with a Gaussian filter. Exploiting the separability of Gaussian filters, I perform the convolution along the y-axis and then along the x-axis.

The code without PyTorch is:

from scipy.ndimage import convolve

img_convolved = convolve(convolve(img, gy[:, None]), gx[None, :])
# default padding mode is 'reflect'

where gx and gy are the 1D gaussian filters.

My “translation” to PyTorch is:

import torch
import torch.nn.functional as F
import numpy as np

img = np.random.rand(512, 512) # random image with shape (512, 512)
pad_w, pad_h = int(np.ceil(img.shape[0] / gx.size()[0])), int(np.ceil(img.shape[1] / gx.size()[0]))

img = torch.from_numpy(img).repeat(1, 1, 1, 1)
gx, gy = torch.flip(gx, (0,)).type(torch.float32), \
         torch.flip(gy, (0,)).type(torch.float32)
gx, gy = gx.repeat(1, 1, 1, 1), gy.repeat(1, 1, 1).unsqueeze(3)
img = F.pad(img, [0, 0, pad_w, pad_h], 'reflect')

convY = F.conv2d(img.float(), gy, stride=1)
convX = F.conv2d(convY, gx, stride=1)

Note that the size of the Gaussian filters changes with the chosen sigma (which will be a hyperparameter to optimize later). Example sizes are 47, 5, 4, …

My problems here are:

  1. The shape of the final convolution output (convX) depends on the filter shape. What I want is an image with the same shape as the original image (512, 512). That is what the scipy convolve method returns, but not what the PyTorch convolution returns.

  2. Will I be able to run this on the GPU by building a convolutional layer?

Hi,

Indeed, a convolution does not guarantee that the output has the same size as the input.
But if you use stride=1, no dilation, and a kernel with an odd size, you can set the padding to floor(kern/2) so that the output has the same size as the input.

And yes, these ops are all implemented on the GPU, so you will be able to use it.
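
To make the padding rule concrete, here is a small sketch of the per-axis output-size formula given in the torch.nn.Conv2d documentation. The helper name conv_out_size is made up for illustration; it is not a PyTorch function. It is evaluated for the kernel sizes mentioned in the question (5, 47 and 4).

```python
def conv_out_size(n, kernel, padding=0, stride=1, dilation=1):
    # Output length along one axis, following the formula in the
    # torch.nn.Conv2d docs (hypothetical helper, not part of PyTorch).
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


for k in (5, 47, 4):
    # padding = floor(k / 2), as suggested above
    print(k, conv_out_size(512, k, padding=k // 2))
# prints:
# 5 512
# 47 512
# 4 513
```

The odd sizes keep the 512-pixel length, while the even size 4 comes out one pixel too large. That is why the rule is stated for odd kernels only: an even kernel needs k - 1 padded pixels in total, which cannot be split equally between the two sides.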

Hi, thank you for the answer.
The result of the first convolution has the desired shape, but the last one (along the x-axis) changes the number of columns.
Any ideas?
These are my new padding dimensions:

pad_w, pad_h = int(np.floor(gx.size()[0] / 2)), int(np.floor(gx.size()[0] / 2))

I simplified your code below:

import torch
import torch.nn.functional as F
import numpy as np

img = torch.rand(1, 1, 512, 512)

kx, ky = 5, 5

# Goes from 1 channel to 1 channel
weight_h = torch.rand(1, 1, ky, 1)
weight_w = torch.rand(1, 1, 1, kx)

padx = kx // 2
pady = ky // 2

convY = F.conv2d(img, weight_h, padding=(pady, 0))
convX = F.conv2d(convY, weight_w, padding=(0, padx))

print(img.size())
print(convY.size())
print(convX.size())
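
If reflect-style borders are wanted, as in the SciPy version, one possible variant is to pad explicitly with F.pad and then convolve without padding. This is only a sketch under some assumptions: gaussian_kernel1d and separable_blur are hypothetical helper names, and the side that receives the extra pixel for an even kernel is an arbitrary choice. Two things to keep in mind: F.pad takes its padding list starting from the last dimension (left, right, top, bottom), so a list like [0, 0, a, b] pads rows only; and PyTorch's 'reflect' mode does not repeat the edge pixel, so border values can differ slightly from scipy.ndimage's default 'reflect' mode.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel1d(sigma, radius):
    # Hypothetical helper: normalized 1D Gaussian with 2 * radius + 1 taps.
    x = torch.arange(-radius, radius + 1, dtype=torch.float32)
    k = torch.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def separable_blur(img, gy, gx):
    # img: (N, 1, H, W); gy, gx: 1D kernels of any length (odd or even).
    # F.conv2d computes a cross-correlation; for a symmetric Gaussian
    # kernel this gives the same result as a true convolution.
    ky, kx = gy.numel(), gx.numel()
    # Pad k - 1 pixels in total per axis so the output keeps the input size.
    # Order is (left, right, top, bottom).
    pad = [(kx - 1) // 2, kx // 2, (ky - 1) // 2, ky // 2]
    img = F.pad(img, pad, mode="reflect")
    out = F.conv2d(img, gy.view(1, 1, ky, 1))   # filter along y (rows)
    return F.conv2d(out, gx.view(1, 1, 1, kx))  # filter along x (columns)


# Runs on the GPU when one is available, otherwise on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
img = torch.rand(1, 1, 512, 512, device=device)
g = gaussian_kernel1d(sigma=1.0, radius=2).to(device)
blurred = separable_blur(img, g, g)
print(blurred.shape)  # torch.Size([1, 1, 512, 512])
```

Because the padding is split as (k - 1) // 2 and k // 2, the output shape matches the input for even kernel sizes as well.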

Thank you very much!! You saved me a lot of time.
By the way, I’ve noticed that when I plot the resulting image, the first and last columns are cut off. I can’t understand why.

Edit: OK, I figured out the bug. It was in another part of the code.
