# When should we use align_corners = False?

I am very confused by this parameter in the PyTorch documentation. According to Wikipedia
(https://en.wikipedia.org/wiki/Bilinear_interpolation), the result of the bilinear interpolation formula is consistent with
`align_corners=True`, which was the default before PyTorch 0.4.0.
When should I use `align_corners=False`?

Same problem here.
Can someone help?

Have you seen the note and examples under `Upsample`? I think they do a great job in explaining why.

Yeah, I checked the example under `Upsample`, but I don't get it.

``````
>>> import torch
>>> import torch.nn as nn
>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input_3x3 = torch.zeros(3, 3).view(1, 1, 3, 3)
>>> input_3x3[:, :, :2, :2].copy_(input)
tensor([[[[ 1.,  2.],
          [ 3.,  4.]]]])
>>> input_3x3
tensor([[[[ 1.,  2.,  0.],
          [ 3.,  4.,  0.],
          [ 0.,  0.,  0.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
>>> # Notice that values in top left corner are the same with the small input (except at boundary)
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.2500,  1.7500,  1.5000,  0.5000,  0.0000],
          [ 1.5000,  1.7500,  2.2500,  1.8750,  0.6250,  0.0000],
          [ 2.5000,  2.7500,  3.2500,  2.6250,  0.8750,  0.0000],
          [ 2.2500,  2.4375,  2.8125,  2.2500,  0.7500,  0.0000],
          [ 0.7500,  0.8125,  0.9375,  0.7500,  0.2500,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> # Notice that values in top left corner are now changed
>>> m(input_3x3)
tensor([[[[ 1.0000,  1.4000,  1.8000,  1.6000,  0.8000,  0.0000],
          [ 1.8000,  2.2000,  2.6000,  2.2400,  1.1200,  0.0000],
          [ 2.6000,  3.0000,  3.4000,  2.8800,  1.4400,  0.0000],
          [ 2.4000,  2.7200,  3.0400,  2.5600,  1.2800,  0.0000],
          [ 1.2000,  1.3600,  1.5200,  1.2800,  0.6400,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])
``````

Whether `align_corners` is `False` or `True`, the top-left corner value is always 1.

Oh, my mistake. Now I get it.

I will show you a 1-dimensional example.

Suppose that you want to resize the tensor [0, 1] to [?, ?, ?, ?], so the scale factor is 2.
For now we only care about coordinates.

For `mode='bilinear'` and `align_corners=False`, the result is the same as in OpenCV and other popular image processing libraries (I guess). The corresponding coordinates in the original tensor are [-0.25, 0.25, 0.75, 1.25], which are calculated by x_original = (x_upsampled + 0.5) / 2 - 0.5. Then you can use these coordinates to interpolate.

For `mode='bilinear'` and `align_corners=True`, the corresponding coordinates are [0, 1/3, 2/3, 1]. The first and last output samples land exactly on the first and last input samples, which is why this is called `align_corners=True`.
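The two coordinate rules above can be checked with a few lines of plain Python. This is only a minimal sketch, not PyTorch's implementation: `source_coords` is a hypothetical helper name, and it returns the raw input-space coordinate of each output sample before any edge handling.

```python
def source_coords(in_size, out_size, align_corners):
    """Return the input-space coordinate of every output sample (1-D)."""
    if align_corners:
        # First and last output samples coincide with the first and last inputs.
        if out_size == 1:
            return [0.0]
        step = (in_size - 1) / (out_size - 1)
        return [i * step for i in range(out_size)]
    # Samples are treated as centers of unit-width cells; centers are mapped.
    scale = in_size / out_size
    return [(i + 0.5) * scale - 0.5 for i in range(out_size)]


print(source_coords(2, 4, align_corners=False))  # [-0.25, 0.25, 0.75, 1.25]
print(source_coords(2, 4, align_corners=True))   # roughly [0, 1/3, 2/3, 1]
```

Note that with `align_corners=False` the first and last coordinates fall outside the range [0, 1], so an implementation has to decide what to do at the border.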

I will be very happy if you find this answer useful.

Talk is cheap, let me show you the code!

``````
# align_corners = False
# x_ori is the coordinate in the original image
# x_up is the coordinate in the upsampled image
factor = 2  # example scale factor
x_up = 3    # example output coordinate
x_ori = (x_up + 0.5) / factor - 0.5
``````
``````
# align_corners = True
# h_ori is the height of the original image
# h_up is the height of the upsampled image
h_ori, h_up = 2, 4  # example sizes
stride = (h_ori - 1) / (h_up - 1)
x_ori_list = []
# append the first coordinate
x_ori_list.append(0)
for i in range(1, h_up - 1):
    x_ori_list.append(0 + i * stride)
# append the last coordinate
x_ori_list.append(h_ori - 1)
``````
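To go from coordinates to actual values, here is a rough 1-D sketch of linear resampling under both settings. `resize_linear_1d` is a hypothetical helper, not a PyTorch function, and the edge handling (clamping coordinates to the valid range) is an assumption. With that assumption it reproduces the first row of each output in the `Upsample` documentation example quoted earlier in this thread.

```python
import math


def resize_linear_1d(values, out_size, align_corners):
    """Linearly resample a 1-D list of numbers to out_size samples."""
    in_size = len(values)
    result = []
    for i in range(out_size):
        if align_corners:
            step = (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
            x = i * step
        else:
            x = (i + 0.5) * in_size / out_size - 0.5
            # Assumption: coordinates left of the first sample are clamped to 0.
            x = max(x, 0.0)
        left = min(int(math.floor(x)), in_size - 1)
        right = min(left + 1, in_size - 1)  # clamp at the right edge
        weight = x - left
        result.append((1 - weight) * values[left] + weight * values[right])
    return result


row = [1.0, 2.0, 0.0]  # first row of input_3x3 in the docs example
print(resize_linear_1d(row, 6, align_corners=False))
print(resize_linear_1d(row, 6, align_corners=True))
```

Up to floating-point rounding, the first call gives 1, 1.25, 1.75, 1.5, 0.5, 0 and the second gives 1, 1.4, 1.8, 1.6, 0.8, 0. Both start at 1 and end at 0, so looking only at the corner values does not reveal the difference; the interior values do.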

I have the same question. The corners are the same, so what is the difference?

Here is a simple illustration I made showing how a 4x4 image is upsampled to 8x8.

When `align_corners=True`, pixels are regarded as a grid of points. Points at the corners are aligned.

When `align_corners=False`, pixels are regarded as 1x1 areas. Area boundaries, rather than their centers, are aligned.
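One consequence of the two views is the spacing between neighbouring output samples, measured in input pixels. The sketch below uses a hypothetical helper, `sample_step`, to compare that spacing for a few input sizes at a fixed scale factor of 2. It is an illustration of the geometry described above, not library code.

```python
from fractions import Fraction


def sample_step(in_size, out_size, align_corners):
    """Spacing between neighbouring output samples, in input-pixel units."""
    if align_corners:
        # Points view: (in_size - 1) intervals are split into (out_size - 1).
        return Fraction(in_size - 1, out_size - 1)
    # Area view: the same total extent is split into out_size cells.
    return Fraction(in_size, out_size)


for in_size in (2, 3, 4):
    out_size = 2 * in_size
    print(in_size, "->", out_size,
          "| False:", sample_step(in_size, out_size, False),
          "| True:", sample_step(in_size, out_size, True))
```

With `align_corners=False` the step is 1/2 for every size, while with `align_corners=True` it is 1/3, 2/5 and 3/7 for sizes 2, 3 and 4. This is consistent with the `Upsample` documentation example, where padding the 2x2 input to 3x3 leaves the top-left values unchanged only when `align_corners=False`.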

Thanks for your image! Do you know whether we should use `align_corners=True` or `align_corners=False` for semantic segmentation tasks?