I have a 100x200 image and want to reduce it to 100x100 using nn.MaxPool2d. How do I set the kernel size and stride correctly?
Here is an example:
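import torch
# Batch of 128 RGB images in (N, C, H, W) layout: height 200, width 100
img = torch.zeros((128,3,200,100))
# Pad one pixel on the left and top so the 2x2 convolution leaves the size at 200x100
padded_image = torch.nn.functional.pad(img, (1,0,1,0), mode='replicate')
conv = torch.nn.Conv2d(3, 64, (2,2), stride=(1,1))
# Kernel (2,1) pools pairs of rows and leaves columns alone; stride defaults to the kernel size
mp = torch.nn.MaxPool2d((2,1))
z = mp(conv(padded_image))
print(z.shape)  # torch.Size([128, 64, 100, 100])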
In this case we pad the image by one pixel on the top and left, convolve with 2x2 filters, and then max pool with a (2,1) kernel to get a 100x100 output.
You generally want to use either max pooling or a strided convolution to shrink the image. An unpadded convolution also shrinks it slightly (a 2x2 kernel removes one pixel per dimension), which is why I pad it. If you drop the pad, the max pool rounds the height down and the example above comes out 99x99 instead of 100x100. You can also change the conv stride to something like (2,1) and skip the max pool.
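As a rough sketch of those two options, here is the strided convolution without a pooling layer, followed by max pooling on its own with no convolution. The batch size of 2 and the names strided_conv and pool_only are my own choices for illustration; the padding matches the example above.

```python
import torch

# Small batch (an assumption, just to keep this quick); layout is (N, C, H, W)
img = torch.zeros((2, 3, 200, 100))
padded = torch.nn.functional.pad(img, (1, 0, 1, 0), mode='replicate')  # 201x101

# Option 1: strided convolution, step 2 along height and 1 along width, no pooling
strided_conv = torch.nn.Conv2d(3, 64, (2, 2), stride=(2, 1))
strided_out = strided_conv(padded)
print(strided_out.shape)  # torch.Size([2, 64, 100, 100])

# Option 2: pooling only, applied to the unpadded image
pool_only = torch.nn.MaxPool2d(kernel_size=(2, 1), stride=(2, 1))
pooled_out = pool_only(img)
print(pooled_out.shape)  # torch.Size([2, 3, 100, 100])
```

Option 2 is the direct answer to the original question: a (2,1) kernel with a (2,1) stride halves the height and leaves the width and channel count unchanged.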
You can print the size of the tensor after any of these operations to see how it affects the size.
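One way to do that, as a minimal sketch: the helper shape_trace below is hypothetical (not part of PyTorch) and just records the shape after each stage of the example, with and without the pad. The batch size of 2 is also my assumption.

```python
import torch
import torch.nn.functional as F

def shape_trace(x, use_pad=True):
    """Hypothetical helper: return (stage, shape) pairs for pad -> conv -> max pool."""
    conv = torch.nn.Conv2d(3, 64, (2, 2), stride=(1, 1))
    pool = torch.nn.MaxPool2d((2, 1))
    trace = [("input", tuple(x.shape))]
    if use_pad:
        # One extra pixel on the left and top, as in the example
        x = F.pad(x, (1, 0, 1, 0), mode='replicate')
        trace.append(("pad", tuple(x.shape)))
    x = conv(x)
    trace.append(("conv", tuple(x.shape)))
    x = pool(x)
    trace.append(("maxpool", tuple(x.shape)))
    return trace

img = torch.zeros((2, 3, 200, 100))
for use_pad in (True, False):
    print("with pad" if use_pad else "without pad")
    for stage, shape in shape_trace(img, use_pad=use_pad):
        print(" ", stage, shape)
```

With the pad, the last stage reports 100x100; without it, the convolution output is 199x99 and the pooled result is 99x99.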