# How to extract patches from an image

Given an image, I need to extract patches with stride of 32, after changing the smaller dimension of the image to 227.

How can I do that?

You could use `unfold` as described in this post.
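As a rough sketch of that approach (using `F.interpolate` to resize the smaller spatial dimension to 227; the input shape here is a dummy stand-in for the real image):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 576, 1024)  # dummy batch, [B, C, H, W]

# resize so the smaller spatial dim becomes 227, keeping the aspect ratio
h, w = x.shape[-2:]
scale = 227 / min(h, w)
x = F.interpolate(x, size=(round(h * scale), round(w * scale)),
                  mode='bilinear', align_corners=False)

size = 32    # patch size
stride = 32  # patch stride
patches = x.unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)  # torch.Size([1, 3, 7, 12, 32, 32])
```

Note that only the spatial dims (2 and 3) are unfolded here, which keeps the channel dim intact.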

I get this error:

```
RuntimeError: maximum size for tensor at dimension 3 is 3 but size is 32
```

when I have this code:

```python
S = 128 # channel dim
W = 227 # width
H = 227 # height
batch_size = 10

x = image_new.unsqueeze(0)

size = 32 # patch size
stride = 32 # patch stride
patches = x.unfold(1, size, stride).unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)
```

The following code works for me:

```python
S = 128 # channel dim
W = 227 # width
H = 227 # height
batch_size = 10

x = torch.randn(batch_size, S, H, W)

size = 32 # patch size
stride = 32 # patch stride
patches = x.unfold(1, size, stride).unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)
```

What shape does `x` have after the `unsqueeze` operation?

I first loaded the image like so:

```python
import numpy as np
import cv2
import torch

img = cv2.imread('cat.jpg')       # loads as [H, W, C] in BGR order
image_new = torch.from_numpy(img)
```

and then I used the code above

and the shape of `x` is:

```
torch.Size([1, 576, 1024, 3])
```

`dim3` has only a size of 3, so you cannot unfold it with a kernel size of 32.
I guess you would like to unfold only in dim1 and dim2?

Unrelated to this, but note that PyTorch expects image tensors in the shape `[batch_size, channels, height, width]`.
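For example, a tensor loaded in `[H, W, C]` layout can be brought into that shape like this (random values standing in for the actual image):

```python
import torch

img_hwc = torch.randn(576, 1024, 3)  # image as many loaders return it: [H, W, C]
x = img_hwc.permute(2, 0, 1)         # -> [C, H, W]
x = x.unsqueeze(0)                   # -> [1, C, H, W]
print(x.shape)  # torch.Size([1, 3, 576, 1024])
```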

So the method of loading the image with OpenCV and then converting it to a tensor is wrong? Is there another way to convert the image to a tensor so that it has the shape you specified?

Also, when I unfold only dim1 and dim2, I get this shape:

```
torch.Size([1, 18, 32, 3, 32, 32])
```

and when I show the image using plt, there’s still something wrong.

You could use `PIL` to load the image and then `torch.from_numpy` to create the tensor or alternatively use OpenCV, transform the image from BGR to RGB, and permute the dimensions via `image = image.permute(2, 0, 1)`.

The first solution gave me the same dimensions as before, and I also had to use different code, like so:

```python
image = torch.as_tensor(np.array(image).astype('float'))
```

When I use the code you posted like so:

```python
import torch
from PIL import Image

# open method used to open different extension image file
im = Image.open(r"cat.jpg")
image_new = torch.from_numpy(im)
```

it gives me an error:

```
TypeError: expected np.ndarray (got JpegImageFile)
```

So the first solution doesn’t work.

With the second solution, I got this error when I called `permute`:

```
AttributeError: 'numpy.ndarray' object has no attribute 'permute'
```

The code was like so:

```python
import cv2

image = cv2.imread('cat.jpg', 1)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # transform image from BGR to RGB
image = image.permute(2, 0, 1)
```

I want to make 32x32 patches with stride=16 (overlapping). How can I reshape/reconstruct these patches back into the original image?

Sorry for the confusion.
This should work:

```python
import numpy as np
import PIL.Image
import torch
from torchvision import transforms

img = PIL.Image.open(path)
x = torch.from_numpy(np.array(img))
x = x.permute(2, 0, 1)

# or
y = transforms.ToTensor()(img) # will permute and normalize to [0, 1]
```

`nn.Unfold` and `nn.Fold` will give you the ability to recreate the input, but note that the overlapping pixels will be summed.
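A small sketch of that round trip, assuming a dummy image whose size tiles evenly with kernel 32 and stride 16 (the summed overlaps are undone by counting how often each pixel is covered):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)  # dummy image; 64 tiles evenly with these settings

size = 32    # patch size
stride = 16  # overlapping stride
unfold = nn.Unfold(kernel_size=size, stride=stride)
fold = nn.Fold(output_size=x.shape[-2:], kernel_size=size, stride=stride)

patches = unfold(x)    # [1, 3*32*32, num_patches]
recon = fold(patches)  # overlapping pixels are summed

# divide by the number of patches covering each pixel to undo the summation
counts = fold(unfold(torch.ones_like(x)))
recon = recon / counts

print(torch.allclose(recon, x))  # True
```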

Now it’s giving me this error:

```
RuntimeError: maximum size for tensor at dimension 1 is 3 but size is 128
```

and the shape of `x` is:

```
torch.Size([1, 3, 576, 1024])
```

and my code is:

```python
import numpy as np
import torch
from PIL import Image

img = Image.open("cat.jpg")
x = torch.from_numpy(np.array(img))
x = x.permute(2, 0, 1)

x = x.unsqueeze(0)

size = 128 # patch size
stride = 32 # patch stride
patches = x.unfold(1, size, stride).unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)
```

Why do you have 128 channels anyway? Aren’t there only 3: red, green, and blue?

I used your code snippet from this post. If you are not dealing with 128 channels, then you should change it.

Regarding the error message: you cannot use a kernel size of 128 for 3 channels.
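Putting it together, a corrected version of the last snippet (kernel size 32 instead of 128, unfolding only the spatial dims) could look like this, with a random tensor standing in for the loaded image:

```python
import torch

x = torch.randn(1, 3, 576, 1024)  # [1, C, H, W], as in the thread

size = 32    # patch size
stride = 32  # patch stride
patches = x.unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)  # torch.Size([1, 3, 18, 32, 32, 32])
```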