Unexpected behaviour for affine_grid and grid_sample with 3D inputs

EmilienDupont · April 1, 2020, 1:57pm

Hi, I posted this issue on github and was told to ask about it on here. Thanks for your help!

Bug

When performing affine transformations on 3D tensors I’ve noticed some unexpected behaviour which does not occur for 2D tensors.

Specifically, suppose we perform two 3D rotations with two rotation matrices A and B. Given a 3D input tensor x, we can first rotate it with B to get the rotated tensor x_B. We can then further rotate x_B by A to get the rotation corresponding to B followed by A, i.e. to the rotation matrix AB. We can denote this resulting tensor by x_AB.

Alternatively, we could have directly taken the matrix product C = AB and transformed the tensor with C, to obtain x_C. Now x_AB and x_C should be approximately equal up to sampling errors. However in practice this is not the case. As an example:

Clearly, the cube rotated by B followed by A is not the same as the one rotated by C=AB. Note that this does not happen in 2D. If we perform similar transformations to a 2D input, we obtain:

As can be seen, the transformations by AB and C are roughly the same (AB has more sampling noise since it corresponds to two affine transformations and two grid_samples), so the error does not occur in 2D.

Any idea why the difference in 3D might be occurring? Thank you!

To Reproduce

import torch

def get_3d_cube(grid_size=32, cube_size=16):
    """Returns a 3d cube which is useful for debugging and visualizing 3d
    rotations.

    Args:
        grid_size (int): Size of sides of 3d grid.
        cube_size (int): Size of sides of cube within grid.
    """
    cube = torch.zeros(1, 1, grid_size, grid_size, grid_size)
    hs = grid_size // 2  # Half grid size
    hcs = cube_size // 2  # Half cube size

    cube[:, :, -hcs + hs:hs + hcs, hs + hcs, hs + hcs] = 1
    cube[:, :, -hcs + hs:hs + hcs, hs + hcs, hs - hcs] = 1
    cube[:, :, -hcs + hs:hs + hcs, hs - hcs, hs + hcs] = 1
    cube[:, :, -hcs + hs:hs + hcs, hs - hcs, hs - hcs] = 1

    cube[:, :, hs + hcs, -hcs + hs:hs + hcs, hs + hcs] = 1
    cube[:, :, hs - hcs, -hcs + hs:hs + hcs, hs + hcs] = 1
    cube[:, :, hs + hcs, -hcs + hs:hs + hcs, hs - hcs] = 1
    cube[:, :, hs - hcs, -hcs + hs:hs + hcs, hs - hcs] = 1

    cube[:, :, hs + hcs, hs + hcs, -hcs + hs:hs + hcs] = 1
    cube[:, :, hs - hcs, hs + hcs, -hcs + hs:hs + hcs] = 1
    cube[:, :, hs + hcs, hs - hcs, -hcs + hs:hs + hcs] = 1
    cube[:, :, hs - hcs, hs - hcs, -hcs + hs:hs + hcs] = 1
    
    return cube

# Create a cube to use for debugging
cube = get_cube_3d(128, 64)

# 3D rotation matrices with shape (1, 3, 3)
rotation_A = torch.tensor([[[ 0.7198, -0.6428, -0.2620],
                            [ 0.5266,  0.7516, -0.3973],
                            [ 0.4523,  0.1480,  0.8795]]])

rotation_B = torch.tensor([[[ 0.6428, 0.0000,  0.7660],
                            [ 0.0000,  1.0000,  0.0000],
                            [-0.7660,  0.0000,  0.6428]]])

# C = AB, so C is a rotation by B, followed by a rotation by A
rotation_C = rotation_matrixA @ rotation_matrixB

# Convert to affine matrices of shape (1, 3, 4), where the 4th column
# corresponds to a zero translation
translations = torch.zeros(1, 3, 1)
affine_A = torch.cat([rotation_A, translations], dim=2)
affine_B = torch.cat([rotation_B, translations], dim=2)
affine_C = torch.cat([rotation_C, translations], dim=2)

# Get grid for each affine transform
grid_A = torch.nn.functional.affine_grid(affine_A, size=cube.shape)
grid_B = torch.nn.functional.affine_grid(affine_B, size=cube.shape)
grid_C = torch.nn.functional.affine_grid(affine_C, size=cube.shape)

# Rotate cube by B
cube_B = torch.nn.functional.grid_sample(cube, grid_B, mode="bilinear")
# Rotate resulting cube by A, to obtain AB transformation
cube_AB = torch.nn.functional.grid_sample(cube_B, grid_A, mode="bilinear")
# Rotate original cube by C
cube_C = torch.nn.functional.grid_sample(cube, grid_C, mode="bilinear")

print((cube_AB - cube_C).abs().sum())  # This is large

Expected behavior

Rotating a tensor by B followed by A should be approximately the same as rotating by AB.

Environment

PyTorch Version (e.g., 1.0): 1.4.0
OS (e.g., Linux): macOS Catalina
How you installed PyTorch (conda, pip, source): pip
Build command you used (if compiling from source): N/A
Python version: 3.6.8
CUDA/cuDNN version: N/A
GPU models and configuration: N/A
Any other relevant information: N/A

EmilienDupont · April 2, 2020, 3:20pm

I’ve just tested this on GPU and the CUDA function also has the exact same problem.

This is the setup:

PyTorch Version (e.g., 1.0): 1.3.1
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, source): conda
Build command you used (if compiling from source): N/A
Python version: 3.6.9
CUDA/cuDNN version: 9.1
GPU models and configuration: Tesla V100
Any other relevant information: N/A

MicPie · May 17, 2020, 10:18am

Hi Emilien,

I am currently also playing with affine_grid and grid_sample and found your post via the forum search.

When I rotate 3D data in my setup I get a rotated and sheared output.

So far I am still trying to find the root cause but I expect that something unexpected happens in grid_sample I am not taking into account (currently I’m looking into the implications of this github comment, but I still have to think about it).

Have you tried to see if you rotation is preserving edge lengths of your cube, i.e., it only rotates and does not shear too? (I guess a small shear can be quite tricky to see in the 3D visualizations.)

Maybe a more basic setup where you only rotate in one dimension can help you pinpoint the problem better?

Kind regards
Michael

MicPie · May 24, 2020, 9:50am

I was able to track down my errors by carrying everything out in “world” coordinates and then creating the affine grid from them.

I had to do a lot of “visual” debugging with https://github.com/K3D-tools/K3D-jupyter which I used for rendering my point clouds and input/output data (really a great tool for that).

EmilienDupont · June 24, 2020, 4:32pm

Hi Michael,

Thanks for your reply!

The problem ended up being that grid_sample performs an inverse warping, which means that passing an affine_grid for the matrix A actually corresponds to the transformation A^(-1). So in my example above, the transformation with B followed by A actually corresponds to A^(-1)B^(-1) = (BA)^(-1), which means I should use C = BA and not C = AB as would be the case if the original matrix was used by grid_sample.

The reason why it worked in 2D and not 3D is because 2D rotation matrices commute whereas 3D rotation matrices do not.

It looks like you basically found the same fix by using world coordinates!

Thanks again for your replies and good to hear you fixed your problems too.