How to efficiently subsample from large images

I’ve figured out how to reconstruct the patches after model prediction with the code below. I’m not sure the padding calculation in the linked post was correct: it gave me half of the remainder (size % k) as the padding, rather than half of the amount needed to pad each dimension up to a multiple of 160.
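
For these dimensions the arithmetic works out as follows (just a sanity check of the formula used below, with the shapes from my code):

# padding arithmetic for a 1460 x 1936 image with k = 160
# rows: 1460 % 160 = 20, so 160 - 20 = 140 extra rows, i.e. 70 per side
# cols: 1936 % 160 = 16, so 160 - 16 = 144 extra cols, i.e. 72 per side
assert (1460 + 2 * 70) % 160 == 0   # padded height 1600 = 10 * 160
assert (1936 + 2 * 72) % 160 == 0   # padded width 2080 = 13 * 160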

import torch
import torch.nn.functional as F
from tqdm import tqdm

x0 = torch.randn(1460, 1936)  # (h, w)
print(x0.shape)
# kernel (patch) size
k = 160
# stride (equal to k, so patches don't overlap)
d = 160
# padding per side to round each dimension up to a multiple of k;
# the extra % k avoids padding by a full k when the size is already a
# multiple, and the // 2 split assumes the total is even (140 and 144 here)
hpad = ((k - x0.size(0) % k) % k) // 2
wpad = ((k - x0.size(1) % k) % k) // 2
# pad x0; for a 2D tensor F.pad takes (left, right, top, bottom)
x = F.pad(x0, (wpad, wpad, hpad, hpad))
print(x.shape)
print(x.shape)
# unfold into non-overlapping k x k patches, shape (nh, nw, k, k)
patches = x.unfold(0, k, d).unfold(1, k, d)
unfold_shape = patches.size()
# reshape to (batch, 1, k, k) so each patch can be fed to the model
patches = patches.contiguous().view(-1, 1, k, k)
print(patches.shape)
# storage tensor for the per-patch predictions
temp = torch.empty(patches.shape)

# loop over all patches, feeding each model prediction back into the
# storage tensor; [0][1] keeps channel 1 of the single batch element
with torch.no_grad():
    for i, patch in enumerate(tqdm(patches)):
        temp[i] = model(patch.view(1, 1, k, k).to(device, dtype=torch.float)).cpu()[0][1]

# reshape back to (nh, nw, k, k), then interleave patch and pixel dims
patches_orig = temp.view(unfold_shape)
output_h = unfold_shape[0] * unfold_shape[2]  # nh * k
output_w = unfold_shape[1] * unfold_shape[3]  # nw * k
patches_orig = patches_orig.permute(0, 2, 1, 3).contiguous()
patches_orig = patches_orig.view(output_h, output_w)
# slice away the padding to recover the original resolution
reconstructed_image = patches_orig[hpad:output_h - hpad, wpad:output_w - wpad]
print(reconstructed_image.shape)
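
As a quick check that the unfold/reconstruct round trip is lossless, you can skip the model and push the raw patches through the same reshaping; the result should match the input exactly:

# sanity check: rebuild the image from the raw patches instead of predictions
check = patches.view(unfold_shape).permute(0, 2, 1, 3).contiguous()
check = check.view(output_h, output_w)
assert torch.equal(check[hpad:output_h - hpad, wpad:output_w - wpad], x0)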

The last thing I’m stuck on is getting the index from the batch loader so I can reconstruct the patched image. This post has a lot of discussion on the topic, but no simple answers.
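
In the meantime, here is a minimal sketch of what I have in mind, in case it helps: a Dataset that returns the patch index alongside the patch, so the indices collated by the DataLoader can be used to scatter predictions back into the storage tensor. PatchDataset and the batch size of 16 are my own placeholders, not anything from the linked post, and the [:, 1] indexing assumes the same two-channel output as the loop above.

from torch.utils.data import Dataset, DataLoader

class PatchDataset(Dataset):
    # wraps the (N, 1, k, k) patch tensor and returns (index, patch) pairs
    def __init__(self, patches):
        self.patches = patches

    def __len__(self):
        return self.patches.size(0)

    def __getitem__(self, i):
        # returning i lets each prediction be mapped back to its slot
        return i, self.patches[i]

loader = DataLoader(PatchDataset(patches), batch_size=16, shuffle=False)

with torch.no_grad():
    for idxs, batch in loader:
        preds = model(batch.to(device, dtype=torch.float)).cpu()
        # keep channel 1 of each prediction, matching the loop above
        temp[idxs] = preds[:, 1].unsqueeze(1)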