When should we use align_corners=False?

I am very confused by this parameter in the PyTorch documentation. According to Wikipedia
(https://en.wikipedia.org/wiki/Bilinear_interpolation), the bilinear interpolation formula gives results consistent with
align_corners=True, which was the default before PyTorch 0.4.0.
I want to know: when should we use align_corners=False?

Same problem here.
Can someone help?

Have you seen the note and examples under Upsample? I think they do a great job in explaining why.

Yeah, I checked the example under Upsample. I don't get it.

>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input_3x3 = torch.zeros(3, 3).view(1, 1, 3, 3)
>>> input_3x3[:, :, :2, :2].copy_(input)
tensor([[[[ 1.,  2.],
          [ 3.,  4.]]]])
>>> input_3x3
tensor([[[[ 1.,  2.,  0.],
          [ 3.,  4.,  0.],
          [ 0.,  0.,  0.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
>>> # Notice that values in top left corner are the same with the small input (except at boundary)
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.2500,  1.7500,  1.5000,  0.5000,  0.0000],
          [ 1.5000,  1.7500,  2.2500,  1.8750,  0.6250,  0.0000],
          [ 2.5000,  2.7500,  3.2500,  2.6250,  0.8750,  0.0000],
          [ 2.2500,  2.4375,  2.8125,  2.2500,  0.7500,  0.0000],
          [ 0.7500,  0.8125,  0.9375,  0.7500,  0.2500,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> # Notice that values in top left corner are now changed
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.4000,  1.8000,  1.6000,  0.8000,  0.0000],
          [ 1.8000,  2.2000,  2.6000,  2.2400,  1.1200,  0.0000],
          [ 2.6000,  3.0000,  3.4000,  2.8800,  1.4400,  0.0000],
          [ 2.4000,  2.7200,  3.0400,  2.5600,  1.2800,  0.0000],
          [ 1.2000,  1.3600,  1.5200,  1.2800,  0.6400,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

Whether align_corners is False or True, the left top corner is always 1.

Oh, my mistake. Now I get it.
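
For anyone else who gets stuck on the same point: the comment in the docs example is about the whole top-left block of the output, not only the single corner value. Here is a rough 1-D sketch in plain Python (no PyTorch needed) that compares upsampling [1, 2] with upsampling the zero-padded [1, 2, 0]. The helper names source_coord and upsample_linear are made up for this illustration, and the edge clamping is my assumption about how out-of-range coordinates are handled.

```python
import math

def source_coord(x_up, in_size, out_size, align_corners):
    """Map an output index to a fractional input coordinate (hypothetical helper)."""
    if align_corners:
        # Corner samples coincide: 0 -> 0 and out_size - 1 -> in_size - 1.
        return x_up * (in_size - 1) / (out_size - 1)
    # Pixel areas coincide: move to the pixel centre, rescale, move back.
    return (x_up + 0.5) * in_size / out_size - 0.5

def upsample_linear(values, out_size, align_corners):
    """1-D linear interpolation with edge clamping (an assumption of this sketch)."""
    in_size = len(values)
    out = []
    for x_up in range(out_size):
        x = source_coord(x_up, in_size, out_size, align_corners)
        x = min(max(x, 0.0), in_size - 1)   # clamp to the valid index range
        lo = int(math.floor(x))
        hi = min(lo + 1, in_size - 1)
        w = x - lo
        out.append((1 - w) * values[lo] + w * values[hi])
    return out

small = [1.0, 2.0]
padded = [1.0, 2.0, 0.0]

small_false = upsample_linear(small, 4, align_corners=False)
padded_false = upsample_linear(padded, 6, align_corners=False)
small_true = upsample_linear(small, 4, align_corners=True)
padded_true = upsample_linear(padded, 6, align_corners=True)

print(small_false)                         # [1.0, 1.25, 1.75, 2.0]
print(padded_false)                        # [1.0, 1.25, 1.75, 1.5, 0.5, 0.0]
print([round(v, 4) for v in small_true])   # [1.0, 1.3333, 1.6667, 2.0]
print([round(v, 4) for v in padded_true])  # [1.0, 1.4, 1.8, 1.6, 0.8, 0.0]
```

With align_corners=False the first three values agree between the small and the padded input, and only the value next to the old boundary changes. With align_corners=True only the very first value agrees, because the sampling stride changes with the input size. The padded rows match the first rows of the two 6x6 outputs quoted above.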

I will show you a 1-dimension example.

Suppose that you want to resize tensor [0, 1] to [?, ?, ?, ?], so the factor=2.
Now we only care about coordinates.

For mode='bilinear' and align_corners=False, the result is the same as OpenCV and other popular image processing libraries (I guess). The corresponding coordinates are [-0.25, 0.25, 0.75, 1.25], which are calculated by x_original = (x_upsample + 0.5) / 2 - 0.5. You can then use these coordinates to interpolate.

For mode='bilinear' and align_corners=True, the corresponding coordinates are [0, 1/3, 2/3, 1]. From this, you can see why this is called align_corners=True.
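
The two coordinate lists can be reproduced with a few lines of plain Python. This is only a sketch of the arithmetic described here: source_coord is a made-up helper name, and clamping the coordinates to [0, 1] before reading off values is an assumption about how the edges are handled.

```python
def source_coord(x_up, in_size, out_size, align_corners):
    """Fractional input coordinate for output index x_up (hypothetical helper)."""
    if align_corners:
        return x_up * (in_size - 1) / (out_size - 1)
    return (x_up + 0.5) * in_size / out_size - 0.5

in_size, out_size = 2, 4   # resize [0, 1] to four values, factor = 2

coords_false = [source_coord(i, in_size, out_size, False) for i in range(out_size)]
coords_true = [source_coord(i, in_size, out_size, True) for i in range(out_size)]

print(coords_false)                        # [-0.25, 0.25, 0.75, 1.25]
print([round(c, 4) for c in coords_true])  # [0.0, 0.3333, 0.6667, 1.0]

# Because the input tensor is [0, 1], the linearly interpolated value at a
# coordinate c is just c itself once c is clamped to the valid range [0, 1].
values_false = [min(max(c, 0.0), 1.0) for c in coords_false]
values_true = [min(max(c, 0.0), 1.0) for c in coords_true]
print(values_false)                        # [0.0, 0.25, 0.75, 1.0]
```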

I will be very happy if you find this answer useful.

Talk is cheap, let me show you the code!

# align_corners = False
# x_ori is the coordinate in the original image
# x_up is the coordinate in the upsampled image
def coords_align_corners_false(h_up, factor):
    return [(x_up + 0.5) / factor - 0.5 for x_up in range(h_up)]

# align_corners = True
# h_ori is the height of the original image
# h_up is the height of the upsampled image (must be at least 2)
def coords_align_corners_true(h_ori, h_up):
    stride = (h_ori - 1) / (h_up - 1)
    # the first coordinate is pinned to 0
    x_ori_list = [0]
    for i in range(1, h_up - 1):
        x_ori_list.append(i * stride)
    # the last coordinate is pinned to h_ori - 1
    x_ori_list.append(h_ori - 1)
    return x_ori_list

I have the same doubt. The corners are the same, what is the difference?

Here is a simple illustration I made showing how a 4x4 image is upsampled to 8x8.

When align_corners=True, pixels are regarded as a grid of points. Points at the corners are aligned.

When align_corners=False, pixels are regarded as 1x1 areas. Area boundaries, rather than their centers, are aligned.
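
The same two views can be written down as arithmetic. The sketch below places both the 4-pixel input and the 8-pixel output on a shared unit interval and converts each output position back to an input index. The function names are invented for this illustration; it only shows the coordinate bookkeeping, not what PyTorch does internally.

```python
def area_view_position(i, n):
    """Centre of pixel i when n pixels of equal width tile the interval [0, 1]."""
    return (i + 0.5) / n

def point_view_position(i, n):
    """Position of sample i when n point samples span [0, 1] end to end."""
    return i / (n - 1)

in_size, out_size = 4, 8

# align_corners=False: pixels are areas, so the outer boundaries 0 and 1 line up.
# An input index j has its centre at (j + 0.5) / in_size, hence j = u * in_size - 0.5.
coords_false = [area_view_position(i, out_size) * in_size - 0.5
                for i in range(out_size)]

# align_corners=True: pixels are points, so the first and last samples line up.
# An input index j sits at j / (in_size - 1), hence j = u * (in_size - 1).
coords_true = [point_view_position(i, out_size) * (in_size - 1)
               for i in range(out_size)]

print(coords_false)  # [-0.25, 0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3.25]
print([round(c, 4) for c in coords_true])  # starts at 0.0 and ends at 3.0

# Spacing between neighbouring output samples, measured in input pixels.
step_false = coords_false[1] - coords_false[0]   # 0.5, exactly 1 / factor
step_true = coords_true[1] - coords_true[0]      # 3 / 7, not 1 / factor
print(step_false, round(step_true, 4))
```

In the area view the step is exactly half an input pixel, so the outermost output samples fall a quarter pixel outside the first and last input centres. In the point view the ends are pinned, so the step shrinks to 3/7 of an input pixel.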

Thanks for your image! Do you know whether we should use align_corners=True or align_corners=False in a semantic segmentation task?

In a semantic segmentation task, should we use align_corners=True or align_corners=False?

We should set it to True for better performance.

set align_corners=True

Hi,

Thank you for the nice picture. Can we add it to the kornia documentation for explaining the difference?

Sure. Feel free to use it.

Awesome pic, thanks!
I had an issue with my Unet not being equivariant to translations. It turns out align_corners=True was the culprit. As your figure clearly shows, in the “True” case there is a shift between the input and output grids that depends on location. This adds spatial bias in Unets. Your pic helped me see this fast.
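
A toy 1-D check of the translation point, in plain Python rather than an actual Unet. It upsamples an impulse and a copy shifted by one pixel from length 8 to length 16, then tests whether the second result is the first one shifted by two output pixels. upsample_linear and max_shift_error are made-up names, and the edge clamping is an assumption of this sketch.

```python
import math

def upsample_linear(values, out_size, align_corners):
    """1-D linear interpolation with edge clamping (assumed for this sketch)."""
    in_size = len(values)
    out = []
    for x_up in range(out_size):
        if align_corners:
            x = x_up * (in_size - 1) / (out_size - 1)
        else:
            x = (x_up + 0.5) * in_size / out_size - 0.5
        x = min(max(x, 0.0), in_size - 1)
        lo = int(math.floor(x))
        hi = min(lo + 1, in_size - 1)
        w = x - lo
        out.append((1 - w) * values[lo] + w * values[hi])
    return out

def max_shift_error(align_corners):
    """Largest mismatch between 'shift then upsample' and 'upsample then shift'."""
    a = [0.0] * 8
    a[3] = 1.0                      # impulse at index 3
    b = [0.0] * 8
    b[4] = 1.0                      # the same impulse moved right by one pixel
    up_a = upsample_linear(a, 16, align_corners)
    up_b = upsample_linear(b, 16, align_corners)
    # With a factor of 2, a one-pixel input shift should become a two-pixel output shift.
    shifted_up_a = [0.0, 0.0] + up_a[:-2]
    return max(abs(p - q) for p, q in zip(shifted_up_a, up_b))

err_false = max_shift_error(align_corners=False)
err_true = max_shift_error(align_corners=True)
print(err_false)            # 0.0
print(round(err_true, 4))   # clearly non-zero
```

In this small case the two orders of operations agree for align_corners=False, while align_corners=True leaves a visible mismatch because one input pixel corresponds to 15/7 output pixels instead of exactly 2.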

FYI the latest documentation is inconsistent about how bicubic works with align_corners: Upsample — PyTorch master documentation

In the documentation of the align_corners argument, bicubic is not included after “This only has effect when mode is”. However, bicubic is mentioned in the warning section below that, and align_corners does seem to affect it when used:

import torch
import torch.nn.functional as F
import numpy as np

z = torch.from_numpy(np.array([
    [1, 2],
    [3, 4]
], dtype=np.float32)[np.newaxis, np.newaxis, ...])

print(F.interpolate(z, scale_factor=2, mode='bicubic', align_corners=True)[0, 0])
print(F.interpolate(z, scale_factor=2, mode='bicubic', align_corners=False)[0, 0])
tensor([[1.0000, 1.3148, 1.6852, 2.0000],
        [1.6296, 1.9444, 2.3148, 2.6296],
        [2.3704, 2.6852, 3.0556, 3.3704],
        [3.0000, 3.3148, 3.6852, 4.0000]])
tensor([[0.6836, 1.0156, 1.5625, 1.8945],
        [1.3477, 1.6797, 2.2266, 2.5586],
        [2.4414, 2.7734, 3.3203, 3.6523],
        [3.1055, 3.4375, 3.9844, 4.3164]])