I am dividing an image into patches with the function below:
```python
import math
import torch.nn.functional as F

def extract_image_patches(x, kernel, stride=16, dilation=1):
    # Do TF 'SAME' padding
    b, c, h, w = x.shape
    h2 = math.ceil(h / stride)
    w2 = math.ceil(w / stride)
    pad_row = (h2 - 1) * stride + (kernel - 1) * dilation + 1 - h
    pad_col = (w2 - 1) * stride + (kernel - 1) * dilation + 1 - w
    # F.pad takes (left, right, top, bottom), so the width (column) padding goes first
    x = F.pad(x, (pad_col // 2, pad_col - pad_col // 2, pad_row // 2, pad_row - pad_row // 2))
    # Extract patches: (b, c, h2, w2, kernel, kernel) -> (b, kernel, kernel, c, h2, w2)
    patches = x.unfold(2, kernel, stride).unfold(3, kernel, stride)
    patches = patches.permute(0, 4, 5, 1, 2, 3).contiguous()
    return patches.view(b, -1, patches.shape[-2], patches.shape[-1])
```
For a 224 x 224 image, the above function returns a tensor of shape (1, 768, 14, 14). Here 768 = 3 x 16 x 16 is the flattened size of one patch, and (14, 14) is the height and width of the patch grid, meaning there are 14 rows and 14 columns of these patches.
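As a sanity check on those numbers, the same layout can be reproduced in plain NumPy (a sketch assuming the 224 x 224 / stride-16 case, where the image divides evenly by the stride so no padding is needed; the reshape/transpose mirrors the unfold/permute order above):

```python
import numpy as np

c, h, w, k = 3, 224, 224, 16
img = np.arange(c * h * w, dtype=np.float32).reshape(c, h, w)

# Split height and width into a 14 x 14 grid of 16 x 16 patches:
# (c, h2, ki, w2, kj) -> (ki, kj, c, h2, w2), matching unfold + permute above
h2, w2 = h // k, w // k
patches = img.reshape(c, h2, k, w2, k).transpose(2, 4, 0, 1, 3)
flat = patches.reshape(c * k * k, h2, w2)

print(flat.shape)  # (768, 14, 14)
```

Note that in this flattening the channel axis varies fastest, so within the 768-long dimension channel ch occupies the strided indices ch, ch + 3, ch + 6, and so on.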
Is there a way to change the pixel values of one patch with another function? Below is what I’ve tried so far:

```python
import numpy as np

def shade_patch(patch_list, image):
    batch = len(patch_list)
    patches = extract_image_patches(image.unsqueeze(0), 16)
    pa = patches.repeat(batch, 1, 1, 1)
    count = 0
    for x in patch_list:
        # Each group of 5 values describes one patch: (x, y, r, g, b)
        my_patches = np.split(x, len(x) // 5)
        for patch in my_patches:
            x_pos, y_pos, r, g, b = patch
            # Dim 1 flattens in (kernel, kernel, channel) order, so the channels
            # are interleaved with stride 3: ch::3 selects every pixel of one
            # channel inside the patch at (x_pos, y_pos)
            pa[count, 0::3, x_pos, y_pos] = (r / 255.0 - 0.4914) / 0.2023
            pa[count, 1::3, x_pos, y_pos] = (g / 255.0 - 0.4822) / 0.1994
            pa[count, 2::3, x_pos, y_pos] = (b / 255.0 - 0.4465) / 0.2010
        count += 1
    return pa
```
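Because the 768-long dimension flattens in (kernel, kernel, channel) order, each channel's values sit 3 apart. The following PyTorch-free sketch (plain NumPy, mirroring the unfold/permute order above, and assuming the 224/16 no-padding case) shades one patch through that interleaved layout, then folds the grid back into an image to confirm exactly one 16 x 16 block changed:

```python
import numpy as np

c, h, w, k = 3, 224, 224, 16
h2, w2 = h // k, w // k
img = np.zeros((c, h, w), dtype=np.float32)

# (c, h2, ki, w2, kj) -> (ki, kj, c, h2, w2) -> (k*k*c, h2, w2)
flat = img.reshape(c, h2, k, w2, k).transpose(2, 4, 0, 1, 3).reshape(-1, h2, w2)

# Shade patch (3, 5): channel ch occupies indices ch, ch + 3, ch + 6, ...
x_pos, y_pos, rgb = 3, 5, (0.0, 121.0, 255.0)
for ch, v in enumerate(rgb):
    flat[ch::3, x_pos, y_pos] = v

# Fold the patch grid back into a (c, h, w) image
back = flat.reshape(k, k, c, h2, w2).transpose(2, 3, 0, 4, 1).reshape(c, h, w)
block = back[:, x_pos * k:(x_pos + 1) * k, y_pos * k:(y_pos + 1) * k]
print(block[1, 0, 0], back.sum() == block.sum())  # 121.0 True
```

The final check works because the image starts as all zeros: if only the targeted block was written, the whole-image sum equals the block's sum.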
Here each element of patch_list is a numpy array of 5 values ([x, y, r, g, b]). x and y are the positions of the patch to shade; the positions range from 0 to 13 in my case, since I have a 14 x 14 grid of patches. (r, g, b) are the color channel values (0 to 255). So I would expect the patch at coordinate (3, 5) to have been shaded with the (0, 121, 255) color combination if I do the following:
```python
x = np.array([3, 5, 0, 121, 255])
s = shade_patch([x], img)
```
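For reference, the np.split call inside shade_patch just chops a flat array into one [x, y, r, g, b] group per patch, so a single patch_list entry can shade several patches at once. A minimal NumPy-only sketch:

```python
import numpy as np

# Two patches encoded in one flat array: (3, 5) and (7, 2), each with an RGB color
entry = np.array([3, 5, 0, 121, 255, 7, 2, 10, 20, 30])
groups = np.split(entry, len(entry) // 5)

for x_pos, y_pos, r, g, b in groups:
    print(int(x_pos), int(y_pos), (int(r), int(g), int(b)))
# 3 5 (0, 121, 255)
# 7 2 (10, 20, 30)
```

Note that np.split with an integer section count requires the array length to divide evenly, so every entry must contain a multiple of 5 values.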