InfT
(Inf)
July 22, 2021, 9:37am
1
Good morning. I have a dataset of monochromatic images of 4096x256 pixels, and I would like to split each of them into many patches of size 16x16. Every image is a tensor:
torch.Size([1, 256, 4096])
How can I do this?
eqy
July 22, 2021, 8:02pm
2
I believe this is a classic use case where Tensor.unfold can be used. Quoting an earlier answer:

For the second use case you could use Tensor.unfold:
import torch

S = 128 # channel dim
W = 256 # width
H = 256 # height
batch_size = 10
x = torch.randn(batch_size, S, W, H)
size = 64 # patch size
stride = 64 # patch stride
patches = x.unfold(1, size, stride).unfold(2, size, stride).unfold(3, size, stride)
print(patches.shape)
> torch.Size([10, 2, 4, 4, 64, 64, 64])
patches now contains [2, 4, 4] patches of size [64, 64, 64].
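The same idea should carry over to your case. A minimal sketch, assuming your tensor has shape [1, 256, 4096] (channels, height, width) and you want non-overlapping 16x16 patches, so stride equals the patch size:

import torch

x = torch.randn(1, 256, 4096)  # one single-channel image

size = 16    # patch size
stride = 16  # equal to size, so patches do not overlap

# unfold the height (dim 1) and the width (dim 2)
patches = x.unfold(1, size, stride).unfold(2, size, stride)
print(patches.shape)
> torch.Size([1, 16, 256, 16, 16])

# optionally flatten the grid into a batch of 16x16 patches
patches = patches.contiguous().view(-1, size, size)
print(patches.shape)
> torch.Size([4096, 16, 16])

Here 256/16 = 16 windows along the height and 4096/16 = 256 along the width, giving 16 * 256 = 4096 patches in total.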
For the first use case:
You could use pooling operators like nn.MaxPool3d an…