Image segmentation with dynamic image resolution?

I’m trying to do “image” segmentation on gene sequences. I convert each sequence into an image using a one-hot encoding, which gives my “image” a dimension of (CxWxH)=(1x?x25), where ? varies because it depends on the protein I’m currently training on.

I have heard that U-Nets are generally considered good for image segmentation problems, so I was thinking of using such a network. However, I’m running into trouble combining the pooling layers with these dynamic image resolutions. What do people normally do about this?

U-Net implementations are often somewhat flexible when it comes to different input shapes.
Resolutions that are powers of 2 usually work best, since a stride-2 pooling layer halves the spatial size, and sizes that divide evenly let the upsampled activations line up with the skip connections.
However, if your input shapes vary a lot, you could try adaptive pooling layers, which yield a specified output shape regardless of the input size, or manually pad/crop the activations to the desired shapes.
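
As a rough sketch of both options (not a tested recipe for your model), the snippet below pads an input so each spatial size is divisible by `2**depth`, crops the result back afterwards, and shows an adaptive pooling layer producing a fixed output shape. The helper names `pad_to_multiple` and `crop_to`, the sequence length 137, the depth of 3 pooling stages, and the target size `(16, 25)` are all assumptions made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pad_to_multiple(x, multiple):
    """Zero-pad the last two dims (bottom/right) so each is divisible by `multiple`.

    Hypothetical helper: returns the padded tensor and the original (H, W).
    """
    h, w = x.shape[-2:]
    pad_h = (-h) % multiple  # rows to add at the bottom
    pad_w = (-w) % multiple  # columns to add on the right
    # F.pad lists pads for the last dim first: (left, right, top, bottom)
    return F.pad(x, (0, pad_w, 0, pad_h)), (h, w)


def crop_to(x, size):
    """Crop the last two dims back to the original (H, W)."""
    h, w = size
    return x[..., :h, :w]


depth = 3                       # assumed number of stride-2 pooling stages
x = torch.randn(1, 1, 137, 25)  # (N, C, L, 25) with an assumed length L = 137

padded, orig_size = pad_to_multiple(x, 2 ** depth)
print(padded.shape)             # both spatial sizes are now divisible by 8

# ... run the U-Net on `padded` here; the identity stands in for the model ...
restored = crop_to(padded, orig_size)
print(restored.shape)           # same shape as x

# Alternative: adaptive pooling maps any input size to a fixed output size.
pool = nn.AdaptiveAvgPool2d((16, 25))
pooled = pool(x)
print(pooled.shape)
```

If you pad, the padded positions should probably be masked out of the loss so the network is not trained on them. Adaptive pooling avoids the padding but changes the effective resolution per sample, so which option fits better depends on whether you need a per-position output at the original length.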
