Hey!
I’ve noticed that for sufficiently high-resolution images (e.g. full HD) I cannot get an accurate enough “identity grid” using torch.float32
precision; code below.
Posting here:
- to give a heads up to others
- to double check: am I right that this is just a numerical precision issue?
I saw @bnehoran mention the following in the issue on GitHub:
“Allow option for sampling using only the residual displacement/flow. Eliminates the need to constantly add an identity grid to the flow/displacement field, which is imprecise, slow, and very prone to user error.”
If what I’m seeing is what I think I’m seeing, it would be pretty good motivation to add a function that takes just the residual displacement as input.
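To make the idea concrete, here is a rough user-side sketch of what such a function could look like. It is not an existing PyTorch API: `sample_with_residual` is a hypothetical name, and it assumes the residual is given in normalized [-1, 1] units with `align_corners=True` and that the device supports float64. It only works around the problem by building the identity grid and sampling in float64; a built-in version would presumably avoid the identity grid altogether.

```python
import torch
import torch.nn.functional as F


def sample_with_residual(image, residual):
    """Hypothetical helper: warp `image` by a residual displacement field.

    image:    (N, C, H, W) tensor
    residual: (N, H, W, 2) displacement in normalized [-1, 1] units, (x, y) order
    """
    _, _, h, w = image.shape
    # Build the identity grid in float64 so its rounding error stays negligible.
    xs = torch.linspace(-1, 1, w, dtype=torch.float64, device=image.device)
    ys = torch.linspace(-1, 1, h, dtype=torch.float64, device=image.device)
    identity = torch.stack(
        (xs.view(1, w).expand(h, w), ys.view(h, 1).expand(h, w)), dim=-1
    )
    # Broadcast the (1, H, W, 2) identity grid over the batch and add the residual.
    grid = identity.unsqueeze(0) + residual.to(torch.float64)
    # Sample in float64 as well, then cast back to the caller's dtype.
    warped = F.grid_sample(image.to(torch.float64), grid, align_corners=True)
    return warped.to(image.dtype)


# A zero residual should behave like an identity warp.
image = torch.rand(1, 1, 4, 5)
zero_flow = torch.zeros(1, 4, 5, 2)
out = sample_with_residual(image, zero_flow)
print((out - image).abs().max())
```

The obvious cost is the extra memory and compute of going through float64, which is part of why a native residual-only option seems attractive.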
Code:
import torch
import torch.nn.functional as F
W = 1920
H = 1080
shape = [1, 1, H, W]
dtype = torch.float
x_prev = torch.zeros(shape, dtype=dtype)
x_prev[..., :, 0, 0] = 1.
x_next_expected = x_prev.clone()
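# Identity grid in normalized [-1, 1] coordinates, last dim ordered (x, y)
base_grid = torch.stack((torch.linspace(-1, 1, W, dtype=dtype).unsqueeze(0).repeat(H, 1),
                         torch.linspace(-1, 1, H, dtype=dtype).unsqueeze(-1).repeat(1, W)), dim=-1)
base_grid = base_grid.unsqueeze(0)  # add batch dim -> (1, H, W, 2)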
x_next = F.grid_sample(x_prev, base_grid, align_corners=True)
print((x_next_expected - x_next).abs().sum())
print(x_next[..., :2, :2])
print(x_next_expected[..., :2, :2])
print(x_next[..., :2, :2] - x_next_expected[..., :2, :2])
assert torch.allclose(x_next_expected, x_next)
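
One way to check whether this really is just precision would be to run the same experiment in both torch.float32 and torch.float64 and compare the errors. A minimal sketch, assuming CPU execution; `identity_resample_error` is a helper name made up for this post, not part of PyTorch:

```python
import torch
import torch.nn.functional as F


def identity_resample_error(height, width, dtype):
    """Resample an image through an identity grid; return the max abs error."""
    image = torch.zeros(1, 1, height, width, dtype=dtype)
    image[..., 0, 0] = 1.0  # single non-zero pixel in the top-left corner
    xs = torch.linspace(-1, 1, width, dtype=dtype)
    ys = torch.linspace(-1, 1, height, dtype=dtype)
    # Grid of shape (1, H, W, 2) with the last dim ordered (x, y).
    grid = torch.stack(
        (xs.view(1, width).expand(height, width),
         ys.view(height, 1).expand(height, width)),
        dim=-1,
    ).unsqueeze(0)
    resampled = F.grid_sample(image, grid, align_corners=True)
    return (resampled - image).abs().max().item()


err32 = identity_resample_error(1080, 1920, torch.float32)
err64 = identity_resample_error(1080, 1920, torch.float64)
print(f"float32 max abs error: {err32:.3e}")
print(f"float64 max abs error: {err64:.3e}")
```

If the float64 error comes out many orders of magnitude smaller than the float32 one, that would support the rounding explanation: the grid coordinates are scaled by roughly (W - 1) / 2 when converted to pixel positions, so small float32 rounding errors in the grid are amplified and leak into the bilinear weights.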