UNet different image size for the input

This is rather a theoretical question, but I'd like to know whether one can create UNets for different input image sizes.

I already know that the UNet input size may not match the output size.

That should be possible. If you stick to certain spatial sizes, e.g. powers of two, I would assume that most UNet implementations can handle the input.
However, if you would like to use arbitrary input shapes, most likely you would have to add some checks and pad/crop manually for odd sizes etc.
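One common way to handle this (a hedged sketch, not tied to any particular implementation) is to pad the input up to the next multiple of 2**depth before the forward pass and crop the output back afterwards; `model` below is a placeholder for your own UNet:

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(x: torch.Tensor, multiple: int):
    """Pad the last two dims of x up to the next multiple of `multiple`."""
    h, w = x.shape[-2:]
    ph = (multiple - h % multiple) % multiple  # extra rows needed
    pw = (multiple - w % multiple) % multiple  # extra cols needed
    # pad right/bottom only, so cropping back is just slicing
    return F.pad(x, [0, pw, 0, ph]), (h, w)

x = torch.randn(1, 3, 240, 180)
x_padded, (h, w) = pad_to_multiple(x, 2 ** 4)  # assuming 4 pooling stages
print(x_padded.shape)  # torch.Size([1, 3, 240, 192])
# after the forward pass, crop back: out = model(x_padded)[..., :h, :w]
```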

The input and output shapes in the original UNet paper don't match, if I recall correctly, since it uses unpadded ("valid") convolutions, but you could implement it in a matching way.

I’ve created a simple UNet style architecture here, which could be useful as a code base to implement your model. :wink:


I tried your UNet demo with a different input size like this:

x = torch.randn(1, 3, 240, 180)
y = torch.randint(0, nb_classes, (1, 240, 180))

and got

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 45 and 44 in dimension 3
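The mismatch can be traced with back-of-the-envelope arithmetic (assuming 2x2 max-pooling, which floors odd sizes, and 2x upsampling, as in many UNet implementations): the width 180 halves to 90, then 45, then floors to 22; upsampling doubles 22 back to 44, but the matching skip connection is still 45 wide.

```python
# Trace the width through three pooling stages and one upsampling step.
w = 180
encoder = []
for _ in range(3):
    encoder.append(w)
    w = w // 2          # max-pool floors: 180 -> 90 -> 45 -> 22
# the decoder doubles again: 22 -> 44, but the skip connection is 45 wide
print(w, "->", w * 2, "vs skip", encoder[-1])  # 22 -> 44 vs skip 45
```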

And I found tips online saying that you can resize the input or tweak the UNet.

Looks like there's no way around it.

So it would make a lot of sense to create UNets that examine the input and output sizes and build the architecture dynamically based on them.

Looks like UNets are really hard to work with if your images have a different size than the one the architecture was created for.

It would make sense to create UNets that can adapt to any input and output size, but I haven't seen this kind of dynamic architecture yet.

Is there any UNet like this, where I don't need to modify the structure for the specific input and output sizes I have?

Not sure if there are any public implementations, but you could most likely add cropping or padding with some if/else conditions to support variable input shapes.
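As a sketch of that idea (hypothetical helper names, adapt to your own forward()): align the upsampled decoder tensor with its encoder skip connection before concatenating, so odd spatial sizes no longer crash torch.cat:

```python
import torch
import torch.nn.functional as F

def match_and_cat(up: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
    # Pad `up` on the right/bottom if it came out smaller than `skip`
    # (this happens when an odd size was floored by max-pooling).
    dh = skip.size(2) - up.size(2)
    dw = skip.size(3) - up.size(3)
    up = F.pad(up, [dw // 2, dw - dw // 2, dh // 2, dh - dh // 2])
    return torch.cat([skip, up], dim=1)

up = torch.randn(1, 64, 44, 44)    # upsampled from a floored 22x22 map
skip = torch.randn(1, 64, 45, 45)  # encoder feature map with the odd size
out = match_and_cat(up, skip)
print(out.shape)  # torch.Size([1, 128, 45, 45])
```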
