The grid contains the normalized coordinates at which the input image is sampled (interpolated).
They are normalized to the range [-1, 1],
and the extremes -1 and 1 are mapped to the “corners” of the input (what counts as a “corner” depends on the `align_corners`
argument as well).
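As a rough illustration of how `align_corners` changes the “corner” definition, here is a minimal sketch of the mapping from a normalized coordinate to a pixel index along one axis. The helper name `unnormalize` is made up for this example and is not part of the PyTorch API; the formulas reflect my understanding of how `grid_sample` converts coordinates, so treat them as an assumption rather than the reference implementation.

```python
def unnormalize(coord: float, size: int, align_corners: bool) -> float:
    """Map a normalized coordinate in [-1, 1] to a (fractional) pixel index.

    Hypothetical helper for illustration only, not a PyTorch function.
    """
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return (coord + 1) / 2 * (size - 1)
    # -1 and 1 refer to the outer edges of the first and last pixels.
    return ((coord + 1) * size - 1) / 2


# Along an axis with 4 pixels (indices 0..3):
print(unnormalize(-1.0, 4, align_corners=True))   # 0.0  (center of first pixel)
print(unnormalize(1.0, 4, align_corners=True))    # 3.0  (center of last pixel)
print(unnormalize(-1.0, 4, align_corners=False))  # -0.5 (left edge of first pixel)
print(unnormalize(1.0, 4, align_corners=False))   # 3.5  (right edge of last pixel)
```

In both conventions a coordinate of 0 lands in the middle of the axis (index 1.5 here); only the meaning of the extremes differs.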
The docs explain it as:
For each output location `output[n, :, h, w]`, the size-2 vector `grid[n, h, w]` specifies `input` pixel locations `x` and `y`, which are used to interpolate the output value `output[n, :, h, w]`. In the case of 5D inputs, `grid[n, d, h, w]` specifies the `x`, `y`, `z` pixel locations for interpolating `output[n, :, d, h, w]`. `mode` argument specifies `nearest` or `bilinear` interpolation method to sample the input pixels.

`grid` specifies the sampling pixel locations normalized by the `input` spatial dimensions. Therefore, it should have most values in the range of `[-1, 1]`. For example, values `x = -1, y = -1` is the left-top pixel of `input`, and values `x = 1, y = 1` is the right-bottom pixel of `input`.

If `grid` has values outside the range of `[-1, 1]`, the corresponding outputs are handled as defined by `padding_mode`.
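
To make the quoted behavior concrete, here is a small sketch that samples a 2x2 input with `F.grid_sample`. The tensor values and variable names are arbitrary choices for this example. It uses `align_corners=True` so that -1 and 1 land exactly on the corner pixel centers, and the expected values in the comments are what I would expect under that assumption.

```python
import torch
import torch.nn.functional as F

# Input of shape (N, C, H, W) = (1, 1, 2, 2).
inp = torch.tensor([[[[1.0, 2.0],
                      [3.0, 4.0]]]])

# Grid of shape (N, H_out, W_out, 2); the last dimension is (x, y).
# Sampling exactly at the four corners should reproduce the input.
grid = torch.tensor([[[[-1.0, -1.0], [1.0, -1.0]],
                      [[-1.0, 1.0], [1.0, 1.0]]]])
out = F.grid_sample(inp, grid, mode="bilinear",
                    padding_mode="zeros", align_corners=True)
print(out)  # expected to match inp

# A single sampling location far outside [-1, 1] in x.
outside = torch.tensor([[[[3.0, 0.0]]]])  # shape (1, 1, 1, 2)
out_zeros = F.grid_sample(inp, outside, mode="bilinear",
                          padding_mode="zeros", align_corners=True)
out_border = F.grid_sample(inp, outside, mode="bilinear",
                           padding_mode="border", align_corners=True)
print(out_zeros.item())   # 0.0: out-of-range locations read zeros
print(out_border.item())  # 3.0: x is clamped to the right border, y = 0 averages 2 and 4
```

Passing `align_corners` explicitly also avoids relying on the default, which may differ between PyTorch versions.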