How to choose the patch_size/kernel size for nn.Unfold

I am looking into patchifying an image for a ViT using nn.Unfold.

One of the parameters for unfolding is the kernel size/patch size.

What patch size should I choose so that the patches come out as a clean tiling of the image, similar to a sliding window?
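For context, here is a minimal sketch of how the patch size determines the patch grid that nn.Unfold produces when the stride equals the kernel size and there is no padding. The 224 x 224 input and the helper name `patch_grid` are assumptions for illustration only, not my actual image or code.

```python
import torch
from torch import nn

def patch_grid(h, w, p):
    """Rows and columns of non-overlapping p x p patches (hypothetical helper).

    With stride == kernel_size and no padding, nn.Unfold keeps only full
    patches, so leftover pixels at the right and bottom edges are dropped.
    """
    return h // p, w // p

x = torch.rand(1, 3, 224, 224)  # assumed input size, for illustration only
for p in (56, 75):
    rows, cols = patch_grid(224, 224, p)
    out = nn.Unfold(kernel_size=p, stride=p)(x)  # shape: (B, C*p*p, L)
    # L, the number of patches, should equal rows * cols
    print(p, (rows, cols), tuple(out.shape))
```

Under these assumptions, a patch size that does not divide the height and width evenly silently discards the remainder, and the number of patches L grows quickly as the patch size shrinks.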

When I try it on the image shown below

my result comes out as this


Clearly something is going wrong here, because the patches seem to be mixed up and they don't resemble the original picture. If I choose a smaller patch size, only the top region of the original image gets patched.

My code is as follows

import torch
from torch import nn
import cv2
import numpy as np
import matplotlib.pyplot as plt

class Patchify(nn.Module):
    def __init__(self, patch_size=56):
        super().__init__()
        self.p = patch_size
        self.unfold = torch.nn.Unfold(kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # x -> B c h w
        bs, c, h, w = x.shape
        
        x = self.unfold(x)
        # x -> B (c*p*p) L
        
        # Reshaping into the shape we want
        a = x.view(bs, c, self.p, self.p, -1).permute(0, 4, 1, 2, 3)
        # a -> ( B no.of patches c p p )
        return a

#Read image
patch=Patchify(patch_size=75)
image_path='/content/0000000001.png'
img_src = image_path
image = cv2.imread(img_src)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = image.astype('float32') / 255.0  # Normalize to [0, 1]
image = torch.from_numpy(image)
image = image.permute(2,0,1)
image = image.unsqueeze(0)  # add the batch dimension -> (1, C, H, W)

#Patchify
p=patch(image)
X=p.squeeze()
X.shape

#Display patches
from mpl_toolkits.axes_grid1 import ImageGrid
def plot_patches(tensor):
    fig = plt.figure(figsize=(16, 16))
    # Fixed 8 x 8 grid: only the first 64 patches are displayed
    grid = ImageGrid(fig, 111, nrows_ncols=(8, 8), axes_pad=0.1)

    for i, ax in enumerate(grid):
        patch = tensor[i].permute(1, 2, 0).numpy() 
        ax.imshow(patch)
        ax.axis('off')

    plt.show()

plot_patches(X)
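A sanity check that may help separate a reshaping problem from a display problem is sketched below. It assumes nn.Unfold emits patches in row-major order (left to right, then top to bottom) and compares every patch against a direct crop of the input. The tiny 8 x 12 input and the patch size of 4 are made up for illustration.

```python
import torch
from torch import nn

p = 4  # small patch size, for illustration only
# Toy input with unique values so that every crop is distinguishable
x = torch.arange(1 * 3 * 8 * 12, dtype=torch.float32).view(1, 3, 8, 12)
rows, cols = 8 // p, 12 // p  # patch grid: 2 rows x 3 columns

out = nn.Unfold(kernel_size=p, stride=p)(x)                 # (1, 3*p*p, rows*cols)
patches = out.view(1, 3, p, p, -1).permute(0, 4, 1, 2, 3)   # (1, L, 3, p, p)

# Patch i should equal the crop at grid position (i // cols, i % cols)
for i in range(rows * cols):
    r, c = divmod(i, cols)
    crop = x[0, :, r * p:(r + 1) * p, c * p:(c + 1) * p]
    assert torch.equal(patches[0, i], crop)
```

If the assertions pass, the view/permute step is producing correct patches, and the number of columns in the display grid would need to match `cols` for the patches to appear in their original arrangement.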

Is this an issue with how I am viewing the result, or should I try a larger patch size?
Thanks.