Hi, I posted this issue on GitHub and was told to ask about it here. Thanks for your help!

## Bug

When performing affine transformations on 3D tensors I’ve noticed some unexpected behaviour which does not occur for 2D tensors.

Specifically, suppose we perform two 3D rotations with two rotation matrices `A` and `B`. Given a 3D input tensor `x`, we can first rotate it with `B` to get the rotated tensor `x_B`. We can then further rotate `x_B` by `A` to get the rotation corresponding to `B` followed by `A`, i.e. to the rotation matrix `AB`. We can denote this resulting tensor by `x_AB`.
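As a sanity check on the setup, the product of two rotation matrices is itself a rotation matrix (orthogonal, with determinant +1), so `AB` is a valid rotation to compare against. A minimal sketch, using illustrative matrices of my own choosing rather than the ones from the example below:

```python
import math
import torch

def rot_y(theta):
    # Rotation about the y-axis by `theta` radians.
    c, s = math.cos(theta), math.sin(theta)
    return torch.tensor([[ c,  0.0,  s],
                         [0.0, 1.0, 0.0],
                         [-s,  0.0,  c]])

def rot_z(theta):
    # Rotation about the z-axis by `theta` radians.
    c, s = math.cos(theta), math.sin(theta)
    return torch.tensor([[c,  -s,  0.0],
                         [s,   c,  0.0],
                         [0.0, 0.0, 1.0]])

AB = rot_z(0.3) @ rot_y(0.9)
# The product is again a rotation: orthogonal, determinant +1.
assert torch.allclose(AB.T @ AB, torch.eye(3), atol=1e-6)
assert torch.allclose(torch.det(AB), torch.tensor(1.0), atol=1e-6)
```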

Alternatively, we could have directly taken the matrix product `C = AB` and transformed the tensor with `C` to obtain `x_C`. Now `x_AB` and `x_C` should be approximately equal up to sampling errors. However, in practice this is not the case. As an example:

Clearly, the cube rotated by `B` followed by `A` is not the same as the one rotated by `C = AB`. Note that this does not happen in 2D. If we perform similar transformations on a 2D input, we obtain:

As can be seen, the transformations by `AB` and `C` are roughly the same (`AB` has more sampling noise since it corresponds to two affine transformations and two `grid_sample` calls), so the error does not occur in 2D.
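One property that may be worth noting when comparing the two cases: rotations of the plane commute, so in 2D the order of the matrix product never matters, whereas the two 3D matrices used below do not commute. A minimal check (the 2D angles are illustrative choices of my own; the 3D matrices are the ones from the reproduction snippet):

```python
import math
import torch

def rot2d(theta):
    # 2D rotation matrix for angle `theta` (radians).
    c, s = math.cos(theta), math.sin(theta)
    return torch.tensor([[c, -s], [s, c]])

A = rot2d(0.4)
B = rot2d(1.1)

# Planar rotations commute: AB == BA, and both equal one rotation
# by the summed angle.
assert torch.allclose(A @ B, B @ A, atol=1e-6)
assert torch.allclose(A @ B, rot2d(0.4 + 1.1), atol=1e-6)

# The 3D matrices from the reproduction snippet do NOT commute.
A3 = torch.tensor([[ 0.7198, -0.6428, -0.2620],
                   [ 0.5266,  0.7516, -0.3973],
                   [ 0.4523,  0.1480,  0.8795]])
B3 = torch.tensor([[ 0.6428, 0.0000, 0.7660],
                   [ 0.0000, 1.0000, 0.0000],
                   [-0.7660, 0.0000, 0.6428]])
assert not torch.allclose(A3 @ B3, B3 @ A3, atol=1e-3)
```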

Any idea why the difference in 3D might be occurring? Thank you!

## To Reproduce

```python
import torch
import torch.nn.functional as F


def get_3d_cube(grid_size=32, cube_size=16):
    """Returns a 3D cube (wireframe) which is useful for debugging and
    visualizing 3D rotations.

    Args:
        grid_size (int): Size of sides of 3D grid.
        cube_size (int): Size of sides of cube within grid.
    """
    cube = torch.zeros(1, 1, grid_size, grid_size, grid_size)
    hs = grid_size // 2   # Half grid size
    hcs = cube_size // 2  # Half cube size
    # Draw the 12 edges of the cube, 4 along each axis
    cube[:, :, hs - hcs:hs + hcs, hs + hcs, hs + hcs] = 1
    cube[:, :, hs - hcs:hs + hcs, hs + hcs, hs - hcs] = 1
    cube[:, :, hs - hcs:hs + hcs, hs - hcs, hs + hcs] = 1
    cube[:, :, hs - hcs:hs + hcs, hs - hcs, hs - hcs] = 1
    cube[:, :, hs + hcs, hs - hcs:hs + hcs, hs + hcs] = 1
    cube[:, :, hs - hcs, hs - hcs:hs + hcs, hs + hcs] = 1
    cube[:, :, hs + hcs, hs - hcs:hs + hcs, hs - hcs] = 1
    cube[:, :, hs - hcs, hs - hcs:hs + hcs, hs - hcs] = 1
    cube[:, :, hs + hcs, hs + hcs, hs - hcs:hs + hcs] = 1
    cube[:, :, hs - hcs, hs + hcs, hs - hcs:hs + hcs] = 1
    cube[:, :, hs + hcs, hs - hcs, hs - hcs:hs + hcs] = 1
    cube[:, :, hs - hcs, hs - hcs, hs - hcs:hs + hcs] = 1
    return cube


# Create a cube to use for debugging
cube = get_3d_cube(128, 64)

# 3D rotation matrices with shape (1, 3, 3)
rotation_A = torch.tensor([[[ 0.7198, -0.6428, -0.2620],
                            [ 0.5266,  0.7516, -0.3973],
                            [ 0.4523,  0.1480,  0.8795]]])
rotation_B = torch.tensor([[[ 0.6428, 0.0000, 0.7660],
                            [ 0.0000, 1.0000, 0.0000],
                            [-0.7660, 0.0000, 0.6428]]])
# C = AB, so C is a rotation by B, followed by a rotation by A
rotation_C = rotation_A @ rotation_B

# Convert to affine matrices of shape (1, 3, 4), where the 4th column
# corresponds to a zero translation
translations = torch.zeros(1, 3, 1)
affine_A = torch.cat([rotation_A, translations], dim=2)
affine_B = torch.cat([rotation_B, translations], dim=2)
affine_C = torch.cat([rotation_C, translations], dim=2)

# Get grid for each affine transform
grid_A = F.affine_grid(affine_A, size=cube.shape, align_corners=False)
grid_B = F.affine_grid(affine_B, size=cube.shape, align_corners=False)
grid_C = F.affine_grid(affine_C, size=cube.shape, align_corners=False)

# Rotate cube by B
cube_B = F.grid_sample(cube, grid_B, mode="bilinear", align_corners=False)
# Rotate resulting cube by A, to obtain the AB transformation
cube_AB = F.grid_sample(cube_B, grid_A, mode="bilinear", align_corners=False)
# Rotate original cube by C
cube_C = F.grid_sample(cube, grid_C, mode="bilinear", align_corners=False)

print((cube_AB - cube_C).abs().sum())  # This is large
```
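To rule out interpolation noise as the cause, one diagnostic is to compare the sampling grids themselves rather than the resampled volumes. In the continuous limit (ignoring the interpolation of the intermediate volume), sampling with `grid_B` and then `grid_A` reads the original volume at `rotation_B` applied to the points of `grid_A` (the translation is zero), so that composed grid can be written in closed form and compared against `grid_C` elementwise. This is a sketch with variable names of my own:

```python
import torch
import torch.nn.functional as F

# Rotation matrices from the snippet above, shape (1, 3, 3).
rotation_A = torch.tensor([[[ 0.7198, -0.6428, -0.2620],
                            [ 0.5266,  0.7516, -0.3973],
                            [ 0.4523,  0.1480,  0.8795]]])
rotation_B = torch.tensor([[[ 0.6428, 0.0000, 0.7660],
                            [ 0.0000, 1.0000, 0.0000],
                            [-0.7660, 0.0000, 0.6428]]])
rotation_C = rotation_A @ rotation_B

size = (1, 1, 8, 8, 8)  # a small grid is enough for this comparison
zeros = torch.zeros(1, 3, 1)
grid_A = F.affine_grid(torch.cat([rotation_A, zeros], dim=2), size,
                       align_corners=False)
grid_C = F.affine_grid(torch.cat([rotation_C, zeros], dim=2), size,
                       align_corners=False)

# Grid points are stored as row vectors in the last dimension, so
# applying rotation_B to every point of grid_A is a batched matmul
# with the transpose.
grid_composed = grid_A @ rotation_B.transpose(1, 2)

# Sanity check of the construction: with B = identity, the composed
# grid is grid_A itself.
assert torch.allclose(grid_A @ torch.eye(3), grid_A)

# If C = AB described the composed sampling, these two grids would
# agree; printing the discrepancy localizes the issue before any
# interpolation takes place.
print((grid_composed - grid_C).abs().max())
```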

## Expected behavior

Rotating a tensor by `B` followed by `A` should be approximately the same as rotating by `AB`.
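At the level of the matrices themselves this identity is exact, not just approximate; a quick check with the values from the reproduction snippet:

```python
import torch

A = torch.tensor([[ 0.7198, -0.6428, -0.2620],
                  [ 0.5266,  0.7516, -0.3973],
                  [ 0.4523,  0.1480,  0.8795]])
B = torch.tensor([[ 0.6428, 0.0000, 0.7660],
                  [ 0.0000, 1.0000, 0.0000],
                  [-0.7660, 0.0000, 0.6428]])

# Applying B and then A to a point is identical to applying AB once,
# so any mismatch has to come from how the grids are built and
# sampled, not from the linear algebra.
v = torch.tensor([0.2, -0.5, 0.9])
assert torch.allclose(A @ (B @ v), (A @ B) @ v, atol=1e-6)
```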

## Environment

- PyTorch Version (e.g., 1.0): 1.4.0
- OS (e.g., Linux): macOS Catalina
- How you installed PyTorch (`conda`, `pip`, source): pip
- Build command you used (if compiling from source): N/A
- Python version: 3.6.8
- CUDA/cuDNN version: N/A
- GPU models and configuration: N/A
- Any other relevant information: N/A