When should we use align_corners=False?

I am very confused by this parameter in the PyTorch documentation. According to Wikipedia
(https://en.wikipedia.org/wiki/Bilinear_interpolation), the result of the bilinear interpolation formula is consistent with
align_corners=True, which was the default before PyTorch 0.4.0.
When should align_corners=False be used?

Same problem here.
Can someone help?

Have you seen the note and examples under Upsample? I think they do a great job in explaining why.

Yeah, I checked the example under Upsample. I don’t get it.

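>>> import torch
>>> import torch.nn as nn
>>> # 2x2 input from the Upsample documentation example
>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input_3x3 = torch.zeros(3, 3).view(1, 1, 3, 3)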
>>> input_3x3[:, :, :2, :2].copy_(input)
tensor([[[[ 1.,  2.],
          [ 3.,  4.]]]])
>>> input_3x3
tensor([[[[ 1.,  2.,  0.],
          [ 3.,  4.,  0.],
          [ 0.,  0.,  0.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
>>> # Notice that values in top left corner are the same with the small input (except at boundary)
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.2500,  1.7500,  1.5000,  0.5000,  0.0000],
          [ 1.5000,  1.7500,  2.2500,  1.8750,  0.6250,  0.0000],
          [ 2.5000,  2.7500,  3.2500,  2.6250,  0.8750,  0.0000],
          [ 2.2500,  2.4375,  2.8125,  2.2500,  0.7500,  0.0000],
          [ 0.7500,  0.8125,  0.9375,  0.7500,  0.2500,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> # Notice that values in top left corner are now changed
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.4000,  1.8000,  1.6000,  0.8000,  0.0000],
          [ 1.8000,  2.2000,  2.6000,  2.2400,  1.1200,  0.0000],
          [ 2.6000,  3.0000,  3.4000,  2.8800,  1.4400,  0.0000],
          [ 2.4000,  2.7200,  3.0400,  2.5600,  1.2800,  0.0000],
          [ 1.2000,  1.3600,  1.5200,  1.2800,  0.6400,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

Whether align_corners is False or True, the top-left corner is always 1.

Oh, my mistake. Now I get it.
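
For anyone else stuck at the same point: the corner value is the same in both outputs, but the values in between differ. Below is a rough pure-Python sketch that reproduces the first row of each output above from the input row [1, 2, 0]. The function name upsample_linear_1d is made up for illustration, and clamping coordinates to the valid range is an assumption chosen to match the printed boundary values; it is not PyTorch's actual implementation.

```python
def upsample_linear_1d(values, out_size, align_corners):
    """Linear interpolation of a 1-D list (illustrative sketch only)."""
    n = len(values)
    result = []
    for i in range(out_size):
        if align_corners:
            # First and last samples map exactly onto the input endpoints.
            stride = (n - 1) / (out_size - 1) if out_size > 1 else 0.0
            src = i * stride
        else:
            # Pixels are treated as cells; map the cell center.
            src = (i + 0.5) * n / out_size - 0.5
        # Assumption: coordinates outside the input are clamped to the edge.
        src = min(max(src, 0.0), n - 1)
        lo = int(src)              # left neighbour (src >= 0, so this is floor)
        hi = min(lo + 1, n - 1)    # right neighbour, kept in range
        frac = src - lo
        result.append(values[lo] * (1 - frac) + values[hi] * frac)
    return result


row = [1.0, 2.0, 0.0]  # first row of input_3x3
row_false = upsample_linear_1d(row, 6, align_corners=False)
row_true = upsample_linear_1d(row, 6, align_corners=True)
print(row_false)
print(row_true)
```

Up to floating-point rounding, the two printed rows should match the first rows of the two tensors above (1, 1.25, 1.75, 1.5, 0.5, 0 and 1, 1.4, 1.8, 1.6, 0.8, 0).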


I will show you a 1-dimension example.

Suppose that you want to resize the tensor [0, 1] to [?, ?, ?, ?], so factor=2.
Now we only care about coordinates.

For mode='bilinear' and align_corners=False, the result is the same as in OpenCV and other popular image processing libraries (I guess). The corresponding coordinates are [-0.25, 0.25, 0.75, 1.25], which are calculated by x_original = (x_upsample + 0.5) / 2 - 0.5. Then you can use these coordinates to interpolate.

For mode='bilinear' and align_corners=True, the corresponding coordinates are [0, 1/3, 2/3, 1]. The first and last coordinates land exactly on the first and last original pixels, which is why this is called align_corners=True.
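
A small pure-Python check of the two coordinate lists in this 1-D example. The helper name source_coords is hypothetical and only mirrors the formulas described in this post; it does not call PyTorch.

```python
def source_coords(in_size, out_size, align_corners):
    """Map each output index to a (fractional) coordinate in the input."""
    if align_corners:
        # Endpoints map to endpoints; the rest are evenly spaced in between.
        stride = (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
        return [i * stride for i in range(out_size)]
    # Pixels are treated as cells, so cell centers are mapped.
    scale = in_size / out_size
    return [(i + 0.5) * scale - 0.5 for i in range(out_size)]


# Resize [0, 1] (2 samples) to 4 samples, i.e. factor=2.
coords_false = source_coords(2, 4, align_corners=False)
coords_true = source_coords(2, 4, align_corners=True)
print(coords_false)  # [-0.25, 0.25, 0.75, 1.25]
print(coords_true)   # approximately [0, 1/3, 2/3, 1]
```

Note that with align_corners=False the outermost coordinates fall outside [0, 1], so the interpolation has to handle the boundary somehow (for example by clamping).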

I will be very happy if you find this answer useful.


Talk is cheap, so here is the code!

# Example sizes, assumed here for illustration: upsample a height of 2 to 4
h_ori, h_up = 2, 4
factor = h_up / h_ori

# align_corners = False
# x_up is the coordinate in the upsampled image
# each entry is the corresponding coordinate in the original image
x_ori_false = [(x_up + 0.5) / factor - 0.5 for x_up in range(h_up)]

# align_corners = True
# h_ori is the height of the original image
# h_up is the height of the upsampled image (must be > 1 here)
stride = (h_ori - 1) / (h_up - 1)
x_ori_list = []
# append the first coordinate
x_ori_list.append(0)
for i in range(1, h_up - 1):
    x_ori_list.append(0 + i * stride)
# append the last coordinate
x_ori_list.append(h_ori - 1)

I have the same doubt. The corners are the same, what is the difference?

Here is a simple illustration I made showing how a 4x4 image is upsampled to 8x8.

When align_corners=True, pixels are regarded as a grid of points. Points at the corners are aligned.

When align_corners=False, pixels are regarded as 1x1 areas. Area boundaries, rather than their centers, are aligned.
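
To put numbers on the two views for the 4 to 8 case, here is a short pure-Python sketch. It assumes input pixel centers sit at integer coordinates 0..3, so that as 1x1 areas the input spans [-0.5, 3.5]; the variable names are made up for illustration.

```python
in_size, out_size = 4, 8

# align_corners=False: each output pixel is a cell of this width
# measured in input-pixel units.
width = in_size / out_size
centers_false = [(i + 0.5) * width - 0.5 for i in range(out_size)]
# Outer boundaries of the first and last output cells.
outer_edges_false = (centers_false[0] - width / 2,
                     centers_false[-1] + width / 2)
# Outer boundaries of the input when pixels are 1x1 areas.
input_edges = (-0.5, in_size - 0.5)

# align_corners=True: pixels are points; first and last points coincide.
stride = (in_size - 1) / (out_size - 1)
centers_true = [i * stride for i in range(out_size)]

print(centers_false[0], centers_false[-1])  # centers do not hit 0 and 3
print(outer_edges_false, input_edges)       # but the area boundaries match
print(centers_true[0], centers_true[-1])    # corner points match 0 and 3
```

So in the area view the outer boundaries of input and output coincide while the corner centers do not, and in the point view it is the other way around.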


Thanks for your image! Do you know whether we should use align_corners=True or align_corners=False in semantic segmentation tasks?