Upsampling odd pixel numbers

FloHofstetter · March 31, 2021, 11:37pm

Hello all,

I am implementing a fully convolutional network for image processing and I am stuck at one point:
I want the output to have the same resolution as the input, but in my downsampling layers I reach a point where max pooling is applied to a number of pixels that is not divisible by 2, so the result is rounded down (in the summary below, a height of 135 becomes 67). What is the correct way to upsample from there and reach the original resolution again?
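The rounding can be reproduced with plain integer arithmetic. The sketch below assumes every pooling layer uses kernel size 2 and stride 2 without padding and every upsampling layer doubles the size, which matches the shapes in the summary; the helper names `pool_out` and `trace` are made up for this illustration.

```python
def pool_out(n, kernel=2, stride=2):
    """Output length of unpadded max pooling along one axis (rounds down)."""
    return (n - kernel) // stride + 1


def trace(n, levels=4):
    """Sizes along one axis after `levels` poolings followed by `levels` x2 upsamplings."""
    sizes = [n]
    for _ in range(levels):
        n = pool_out(n)
        sizes.append(n)
    for _ in range(levels):
        n = n * 2  # upsampling with scale factor 2
        sizes.append(n)
    return sizes


heights = trace(1080)
widths = trace(1920)
print(heights)  # [1080, 540, 270, 135, 67, 134, 268, 536, 1072]
print(widths)   # [1920, 960, 480, 240, 120, 240, 480, 960, 1920]
```

The width survives because 1920 stays even through all four poolings. The height loses one row at 135 -> 67, and the three remaining doublings turn that into 8 missing rows at the output.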

Greetings Florian

===================================================================================================================
Layer (type:depth-idx)                   Input Shape               Output Shape              Param #
===================================================================================================================
├─Down1: 1-1                             [-1, 3, 1080, 1920]       [-1, 16, 540, 960]        --
|    └─Conv2d: 2-1                       [-1, 3, 1080, 1920]       [-1, 8, 1080, 1920]       224
|    └─ReLU: 2-2                         [-1, 8, 1080, 1920]       [-1, 8, 1080, 1920]       --
|    └─Conv2d: 2-3                       [-1, 8, 1080, 1920]       [-1, 16, 1080, 1920]      1,168
|    └─ReLU: 2-4                         [-1, 16, 1080, 1920]      [-1, 16, 1080, 1920]      --
|    └─MaxPool2d: 2-5                    [-1, 16, 1080, 1920]      [-1, 16, 540, 960]        --
├─Down2: 1-2                             [-1, 16, 540, 960]        [-1, 32, 270, 480]        --
|    └─Conv2d: 2-6                       [-1, 16, 540, 960]        [-1, 16, 540, 960]        6,416
|    └─ReLU: 2-7                         [-1, 16, 540, 960]        [-1, 16, 540, 960]        --
|    └─Dropout2d: 2-8                    [-1, 16, 540, 960]        [-1, 16, 540, 960]        --
|    └─Conv2d: 2-9                       [-1, 16, 540, 960]        [-1, 32, 540, 960]        4,640
|    └─ReLU: 2-10                        [-1, 32, 540, 960]        [-1, 32, 540, 960]        --
|    └─Dropout2d: 2-11                   [-1, 32, 540, 960]        [-1, 32, 540, 960]        --
|    └─Conv2d: 2-12                      [-1, 32, 540, 960]        [-1, 32, 540, 960]        25,632
|    └─ReLU: 2-13                        [-1, 32, 540, 960]        [-1, 32, 540, 960]        --
|    └─Dropout2d: 2-14                   [-1, 32, 540, 960]        [-1, 32, 540, 960]        --
|    └─MaxPool2d: 2-15                   [-1, 32, 540, 960]        [-1, 32, 270, 480]        --
├─Down3: 1-3                             [-1, 32, 270, 480]        [-1, 64, 135, 240]        --
|    └─Conv2d: 2-16                      [-1, 32, 270, 480]        [-1, 64, 270, 480]        18,496
|    └─ReLU: 2-17                        [-1, 64, 270, 480]        [-1, 64, 270, 480]        --
|    └─Dropout2d: 2-18                   [-1, 64, 270, 480]        [-1, 64, 270, 480]        --
|    └─Conv2d: 2-19                      [-1, 64, 270, 480]        [-1, 64, 270, 480]        102,464
|    └─ReLU: 2-20                        [-1, 64, 270, 480]        [-1, 64, 270, 480]        --
|    └─Dropout2d: 2-21                   [-1, 64, 270, 480]        [-1, 64, 270, 480]        --
|    └─MaxPool2d: 2-22                   [-1, 64, 270, 480]        [-1, 64, 135, 240]        --
├─Down4: 1-4                             [-1, 64, 135, 240]        [-1, 64, 67, 120]         --
|    └─Conv2d: 2-23                      [-1, 64, 135, 240]        [-1, 64, 135, 240]        36,928
|    └─ReLU: 2-24                        [-1, 64, 135, 240]        [-1, 64, 135, 240]        --
|    └─Dropout2d: 2-25                   [-1, 64, 135, 240]        [-1, 64, 135, 240]        --
|    └─Conv2d: 2-26                      [-1, 64, 135, 240]        [-1, 64, 135, 240]        102,464
|    └─ReLU: 2-27                        [-1, 64, 135, 240]        [-1, 64, 135, 240]        --
|    └─Dropout2d: 2-28                   [-1, 64, 135, 240]        [-1, 64, 135, 240]        --
|    └─MaxPool2d: 2-29                   [-1, 64, 135, 240]        [-1, 64, 67, 120]         --
├─Up1: 1-5                               [-1, 64, 67, 120]         [-1, 64, 134, 240]        --
|    └─Upsample: 2-30                    [-1, 64, 67, 120]         [-1, 64, 134, 240]        --
|    └─Conv2d: 2-31                      [-1, 64, 134, 240]        [-1, 64, 134, 240]        102,464
|    └─ReLU: 2-32                        [-1, 64, 134, 240]        [-1, 64, 134, 240]        --
|    └─Dropout2d: 2-33                   [-1, 64, 134, 240]        [-1, 64, 134, 240]        --
|    └─Conv2d: 2-34                      [-1, 64, 134, 240]        [-1, 64, 134, 240]        36,928
|    └─ReLU: 2-35                        [-1, 64, 134, 240]        [-1, 64, 134, 240]        --
|    └─Dropout2d: 2-36                   [-1, 64, 134, 240]        [-1, 64, 134, 240]        --
├─Up2: 1-6                               [-1, 64, 134, 240]        [-1, 64, 268, 480]        --
|    └─Upsample: 2-37                    [-1, 64, 134, 240]        [-1, 64, 268, 480]        --
|    └─Conv2d: 2-38                      [-1, 64, 268, 480]        [-1, 64, 268, 480]        102,464
|    └─ReLU: 2-39                        [-1, 64, 268, 480]        [-1, 64, 268, 480]        --
|    └─Dropout2d: 2-40                   [-1, 64, 268, 480]        [-1, 64, 268, 480]        --
|    └─Conv2d: 2-41                      [-1, 64, 268, 480]        [-1, 64, 268, 480]        36,928
|    └─ReLU: 2-42                        [-1, 64, 268, 480]        [-1, 64, 268, 480]        --
|    └─Dropout2d: 2-43                   [-1, 64, 268, 480]        [-1, 64, 268, 480]        --
├─Up3: 1-7                               [-1, 64, 268, 480]        [-1, 16, 536, 960]        --
|    └─Upsample: 2-44                    [-1, 64, 268, 480]        [-1, 64, 536, 960]        --
|    └─Conv2d: 2-45                      [-1, 64, 536, 960]        [-1, 32, 536, 960]        51,232
|    └─ReLU: 2-46                        [-1, 32, 536, 960]        [-1, 32, 536, 960]        --
|    └─Dropout2d: 2-47                   [-1, 32, 536, 960]        [-1, 32, 536, 960]        --
|    └─Conv2d: 2-48                      [-1, 32, 536, 960]        [-1, 32, 536, 960]        9,248
|    └─ReLU: 2-49                        [-1, 32, 536, 960]        [-1, 32, 536, 960]        --
|    └─Dropout2d: 2-50                   [-1, 32, 536, 960]        [-1, 32, 536, 960]        --
|    └─Conv2d: 2-51                      [-1, 32, 536, 960]        [-1, 16, 536, 960]        12,816
|    └─ReLU: 2-52                        [-1, 16, 536, 960]        [-1, 16, 536, 960]        --
|    └─Dropout2d: 2-53                   [-1, 16, 536, 960]        [-1, 16, 536, 960]        --
├─Up4: 1-8                               [-1, 16, 536, 960]        [-1, 1, 1072, 1920]       --
|    └─Upsample: 2-54                    [-1, 16, 536, 960]        [-1, 16, 1072, 1920]      --
|    └─Conv2d: 2-55                      [-1, 16, 1072, 1920]      [-1, 16, 1072, 1920]      2,320
|    └─ReLU: 2-56                        [-1, 16, 1072, 1920]      [-1, 16, 1072, 1920]      --
|    └─Conv2d: 2-57                      [-1, 16, 1072, 1920]      [-1, 1, 1072, 1920]       145
|    └─ReLU: 2-58                        [-1, 1, 1072, 1920]       [-1, 1, 1072, 1920]       --
===================================================================================================================
Total params: 652,977
Trainable params: 652,977
Non-trainable params: 0
Total mult-adds (G): 107.09
===================================================================================================================
Input size (MB): 23.73
Forward/backward pass size (MB): 1592.34
Params size (MB): 2.49
Estimated Total Size (MB): 1618.57
===================================================================================================================

Process finished with exit code 0

Edit: Architecture

aza · April 1, 2021, 1:59am

Hi Florian,
Depending on which max pooling function you are using, there is a padding parameter that you can use. For example, nn.MaxPool2d has a padding argument that defaults to zero (see the MaxPool2d page of the PyTorch 1.8.1 documentation).
I believe setting that parameter to one would do the trick.
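To see what this suggestion does to the shapes, here is a small check using the output-size formula from the nn.MaxPool2d documentation (floor mode). The function name `maxpool_out` is only for illustration, and kernel size 2 with stride 2 is an assumption taken from the halving seen in the summary.

```python
import math


def maxpool_out(n, kernel=2, stride=2, padding=0, dilation=1):
    """Output length along one axis, per the nn.MaxPool2d formula (ceil_mode=False)."""
    return math.floor((n + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1


for n in (135, 240):
    plain = maxpool_out(n, padding=0)
    padded = maxpool_out(n, padding=1)
    # Size after pooling, and the size a x2 upsampling would produce from it.
    print(n, "->", plain, "->", plain * 2, "| padding=1:", padded, "->", padded * 2)
```

Under these assumptions, padding=1 turns 135 into 68 (136 after doubling) and 240 into 121 (242 after doubling), so the even axis changes as well and the sizes still have to be matched on the way back up.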

FloHofstetter · April 1, 2021, 1:17pm

Hello Ali,
thanks for the quick answer, but I don't think that solves my problem. If I choose a fixed padding, it only fits this one special case and not a dynamic input resolution, does it?
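Two resolution-independent options that are commonly used for this (neither comes from the replies above, so treat this as a sketch): upsample to an explicit target size taken from the matching encoder feature map, or pad the input up to a multiple of the total downsampling factor and crop the output afterwards. The helper names `upsample_like` and `pad_to_multiple` are made up for this example; `F.interpolate`, `F.pad` and `F.max_pool2d` are the actual PyTorch functions.

```python
import torch
import torch.nn.functional as F


def upsample_like(x, reference):
    """Resize x to the spatial size of `reference` instead of using a fixed scale factor."""
    return F.interpolate(x, size=reference.shape[-2:], mode="nearest")


def pad_to_multiple(x, multiple):
    """Zero-pad H and W on the bottom/right up to the next multiple.

    Returns the padded tensor and the original (H, W) for cropping later.
    """
    h, w = x.shape[-2:]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    # F.pad order for the last two dimensions: (left, right, top, bottom)
    return F.pad(x, (0, pad_w, 0, pad_h)), (h, w)


x = torch.randn(1, 3, 135, 240)

# Option A: pool, then upsample to the size stored from the encoder side.
pooled = F.max_pool2d(x, kernel_size=2)   # height 135 becomes 67
restored = upsample_like(pooled, x)       # back to 135 x 240

# Option B: pad to a multiple of the downsampling factor, run, then crop.
padded, (h, w) = pad_to_multiple(x, 2)    # height 135 becomes 136
out = F.interpolate(F.max_pool2d(padded, kernel_size=2), scale_factor=2, mode="nearest")
out = out[..., :h, :w]                    # crop back to 135 x 240

print(restored.shape, out.shape)
```

For a network with four 2x poolings like the one in the summary, option B would pad to a multiple of 16, so a height of 1080 becomes 1088 and is cropped back to 1080 at the end.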