Image tensor splitting

I have a dataset of 256 medical images, and I am trying to classify healthy vs. diseased lungs. The images are very large, up to about 5000 x 5000 pixels, and they are not all the same size. So I plan to split each image into 256 x 256 pixel patches, collect the patches from one image into a bag, and give the whole bag a single label. Then I want to try multiple instance learning. Can you please help me with this?


I think this post might be useful: it explains how to use unfold to create patches of the input and how to reshape them back.
Let me know if this approach is applicable to your use case, and please let me know if you get stuck implementing it. :wink:
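
A minimal sketch of that idea, assuming a 4D `[batch, channels, height, width]` tensor whose height and width are already exact multiples of the patch size (the toy shape below is made up for illustration, and the permute/reshape step is just one way to undo the unfold):

```python
import torch

# Toy input: height and width are exact multiples of the patch size (assumption)
x = torch.rand(1, 3, 512, 768)
kh, kw = 256, 256  # patch height and width
sh, sw = 256, 256  # stride equal to the patch size gives non-overlapping patches

# Unfold over height (dim 2), then over width (dim 3)
patches = x.unfold(2, kh, sh).unfold(3, kw, sw)  # [1, 3, n_h, n_w, kh, kw]
n_h, n_w = patches.size(2), patches.size(3)

# Reshape back: put each patch's rows next to its grid row index,
# and each patch's columns next to its grid column index
restored = patches.permute(0, 1, 2, 4, 3, 5).reshape(1, 3, n_h * kh, n_w * kw)
```

With overlapping patches (stride smaller than the patch size) this simple reshape would no longer reconstruct the input, so treat it as a starting point only.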

Also, since you are working with medical image data, have a look at MONAI and TorchIO.
I know that MONAI supports a sliding window inference workflow, but I'm unsure whether it's usable for training or whether TorchIO supports it.

CC @MONAI, @fepegar for more information :slight_smile:


Thanks for the reply.
kc, kh, kw = 256, 256, 256  # kernel size
dc, dh, dw = 256, 256, 256  # stride

x = F.pad(x, (imgs.size(2) % kw // 2, imgs.size(2) % kw // 2,
              imgs.size(1) % kh // 2, imgs.size(1) % kh // 2,
              imgs.size(0) % kc // 2, imgs.size(0) % kc // 2))

patches = x.unfold(1, kw, dw).unfold(2, kh, dh).unfold(3, kc, dc)

patches = patches.contiguous().view(-1, kc, kh, kw)

img size = torch.Size([1, 3, 3055, 2935])

If I run the code above, I get this error:

RuntimeError: shape ‘[-1, 1, 256, 265]’ is invalid for input of size 45416448

You might need to adapt the padding for your input based on this post and make sure the output shape is as expected.
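
As far as I can tell, the snippet above treats the image as a 3D volume (it also unfolds the channel dimension with a kernel of 256) and pads by the remainder `size % k` instead of by the amount missing to the next multiple of 256. Here is a rough sketch of a 2D variant for a `[1, C, H, W]` image. The helper name `to_patch_bag` is made up for this example, and zero padding split between both sides is an assumption, not the only reasonable choice:

```python
import torch
import torch.nn.functional as F

def to_patch_bag(img, k=256):
    """Pad a [1, C, H, W] image to multiples of k and return patches as [N, C, k, k]."""
    _, c, h, w = img.shape
    # Amount missing to reach the next multiple of k (0 if already divisible)
    pad_h = (k - h % k) % k
    pad_w = (k - w % k) % k
    # F.pad takes pairs starting from the last dim: (left, right, top, bottom)
    img = F.pad(img, (pad_w // 2, pad_w - pad_w // 2,
                      pad_h // 2, pad_h - pad_h // 2))
    # Unfold height and width only; the channel dimension stays intact
    patches = img.unfold(2, k, k).unfold(3, k, k)  # [1, C, n_h, n_w, k, k]
    patches = patches.permute(0, 2, 3, 1, 4, 5)    # [1, n_h, n_w, C, k, k]
    return patches.reshape(-1, c, k, k)

img = torch.rand(1, 3, 3055, 2935)  # same shape as in the post above
bag = to_patch_bag(img)
print(bag.shape)  # torch.Size([144, 3, 256, 256])
```

Both sides are padded up to 3072 = 12 * 256 here, so one image yields a bag of 144 patches that can share a single label for multiple instance learning. Patches that contain mostly padding or background may need to be filtered out before training.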

Thanks for the heads up!

TorchIO does support sliding window inference: see Patch-based pipelines.