[Info Need] How to resize 3D volumetric data?

For 3D volumetric data, how can we resample (upsample and downsample) in PyTorch?

For example:

[ format: (bs, h, w, depth, channel) ]
original_input = (None, 256, 256, 40, 3)

upsample_output = (None, 512, 512, 80, 3)
dnsample_output = (None, 128, 128, 20, 3)

I’m trying to use it for 3D CT/MRI data, so the upsampling or downsampling method should also behave reasonably along the depth axis.

You can use torch.nn.Upsample.

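But your data has to be in one of the following formats: (N, C, W), (N, C, H, W) or (N, C, D, H, W). For volumetric data that is the last one, so the channel axis comes right after the batch axis, and depth comes before height and width.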

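Here is the documentation. There you can see the available interpolation modes and pick the one you want, since the result changes depending on whether you use nearest, bilinear or something else. For 5D volumetric input, the linear mode is called trilinear.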
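As a rough sketch of what this means for the shapes in the question: assuming the data really is laid out channels-last as (bs, h, w, depth, channel), it can be permuted into the (N, C, D, H, W) layout that torch.nn.Upsample works on for volumes. The tensor names and the reduced sizes below are made up for illustration.

```python
import torch

# Hypothetical channels-last batch: (batch, height, width, depth, channels).
# Sizes are reduced from the question's 256 x 256 x 40 to keep the example light.
x_channels_last = torch.rand(2, 64, 64, 10, 3)

# Reorder to (N, C, D, H, W), the layout nn.Upsample expects for volumes.
x = x_channels_last.permute(0, 4, 3, 1, 2).contiguous()

# 'trilinear' is the linear mode for 5D input; 'bilinear' is for 4D input only.
up = torch.nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)
y = up(x)

# Permute back if the rest of the pipeline needs channels-last again.
y_channels_last = y.permute(0, 3, 4, 2, 1)

print(x.shape)                # (2, 3, 10, 64, 64)
print(y.shape)                # (2, 3, 20, 128, 128)
print(y_channels_last.shape)  # (2, 128, 128, 20, 3)
```

If the data is already stored channels-first, the two permute calls can simply be dropped.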

Hope this helps :smile:

Thanks. Seeing the layer name, is it also possible to downsample the input with this upsample layer?

Yes. You can also specify the size that you want instead of a scaling factor, and that size can be smaller than the input.

# Example
import torch

a = torch.rand(3, 3, 40, 256, 256)  # (N, C, D, H, W)
up = torch.nn.Upsample(size=(80, 512, 512))
down = torch.nn.Upsample(size=(20, 128, 128))

print('down = ', down(a).shape)
print('up = ', up(a).shape)

# Output
# down =  torch.Size([3, 3, 20, 128, 128])
# up =  torch.Size([3, 3, 80, 512, 512])
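For comparison, here is a small sketch of the same idea with scale_factor instead of size. A factor below 1 shrinks the volume, and a tuple gives one factor per (D, H, W) axis; the output size is the input size times the factor, rounded down. The variable names and the reduced spatial size are illustrative only.

```python
import torch

a = torch.rand(1, 3, 40, 64, 64)  # (N, C, D, H, W), spatial size reduced for speed

# One factor for all three spatial axes.
halve_all = torch.nn.Upsample(scale_factor=0.5)

# One factor per axis: halve the depth, leave height and width unchanged.
halve_depth_only = torch.nn.Upsample(scale_factor=(0.5, 1.0, 1.0))

print(halve_all(a).shape)         # (1, 3, 20, 32, 32)
print(halve_depth_only(a).shape)  # (1, 3, 20, 64, 64)
```

torch.nn.functional.interpolate takes the same size, scale_factor and mode arguments if a functional call is more convenient than a layer.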

Thanks, @Matias_Vasquez. I think the layer name should be torch.nn.Resample.

I'd like to know how torch.nn.Upsample works for downsampling. Along the depth axis of volumetric data, it can be hard to decide on an appropriate strategy for dropping slices, and the right choice depends on the domain.

For example, in medical data, if we drop slices blindly, we might lose information. In CT/MRI images, most of the relevant information appears in the middle range of the depth axis.

a = torch.rand(3, 3, 40, 256, 256)

down = torch.nn.Upsample(size=(20, 128, 128))
down(a).shape  # torch.Size([3, 3, 20, 128, 128])

So how does 40 become 20? Which slices are dropped?

D(out) = D(in) * scale_factor ... ?

This is what you decide with the mode parameter. The default value (if you do not specify one) is nearest.

When you divide the size by 2 with nearest, every second row/column/slice is dropped, and each output value is copied from the nearest input position, which is simply a row/column/slice that was not dropped. No values are blended. If you choose a linear mode instead (bilinear for 2D images, trilinear for volumes), each output value is linearly interpolated between the neighbouring voxels, so neighbours are blended rather than one of them being discarded.
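One way to check which slices survive is to write the slice index into the voxel values and look at what comes out. The following is a minimal sketch on a toy volume with 8 slices, not on real CT/MRI data; all names in it are illustrative.

```python
import torch

# Toy volume in (N, C, D, H, W) layout: every voxel of slice d holds the value d.
depth = 8
vol = torch.arange(depth, dtype=torch.float32).view(1, 1, depth, 1, 1)
vol = vol.repeat(1, 1, 1, 4, 4)

nearest = torch.nn.Upsample(size=(4, 2, 2), mode="nearest")
trilinear = torch.nn.Upsample(size=(4, 2, 2), mode="trilinear", align_corners=False)

# Read one voxel per output slice to see which input slices it came from.
kept = nearest(vol)[0, 0, :, 0, 0]
blended = trilinear(vol)[0, 0, :, 0, 0]

print(kept.tolist())     # [0.0, 2.0, 4.0, 6.0]: slices 1, 3, 5, 7 are skipped
print(blended.tolist())  # about [0.5, 2.5, 4.5, 6.5]: each is the mean of two slices
```

With nearest, half of the slices never reach the output. With trilinear and a factor of exactly 2, every input slice contributes to one output slice.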

However, if the most important information is in the middle, then you can slice your data first. This way you select the parts that matter most to you and retain more valuable information.

After slicing you can then still downsample, but less data will be lost in order to get the same size.

The scaling factor will then be smaller (dividing by 1.5 instead of 2, for example), and the output values are still computed according to the selected mode.
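A minimal sketch of this crop-then-downsample idea, assuming (purely for illustration) that the central 30 of the 40 slices are the ones that matter. The slice bounds, the reduced spatial size and the choice of trilinear mode are assumptions, not a recommendation for any particular dataset.

```python
import torch

a = torch.rand(1, 3, 40, 64, 64)  # (N, C, D, H, W), spatial size reduced for speed

# Assumed region of interest: slices 5..34, i.e. the central 30 of 40.
start, stop = 5, 35
middle = a[:, :, start:stop]

# Going from 30 to 20 slices divides the depth by 1.5 instead of 2.
down = torch.nn.Upsample(size=(20, 32, 32), mode="trilinear", align_corners=False)
out = down(middle)

print(middle.shape)  # (1, 3, 30, 64, 64)
print(out.shape)     # (1, 3, 20, 32, 32)
```

The final shape is the same as when downsampling all 40 slices directly, but the 20 output slices now cover only the cropped range.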
