Doubt about how to properly use 3D affine transformation

Hi, let’s say I have a grid, grid, which is a 3D representation of size (size, size, size), and I’d like to apply a rotation, a scaling and a translation (R, S, T) to it. All three are 4x4 matrices in homogeneous coordinates, with T = [Identity(4,3) | t], where Identity(4,3) is an identity matrix of 4 rows and 3 columns and t is a vector of size 4 with 1 in its last position.

The equivalent transformation is defined as
theta = torch.bmm(torch.bmm(T, S), R)

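To generate the sampling positions I call
sample_grid = affine_grid(theta[:, :3, :], (batch_size, num_channels, new_size, new_size, new_size))  # for 5D sizes affine_grid expects theta of shape (N, 3, 4), so the last row of the 4x4 matrix is dropped
But according to the implementation, affine_grid does the following: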

Tensor base_grid = make_base_grid_5D(theta, N, C, D, H, W, align_corners);
auto grid = base_grid.view({N, D * H * W, 4}).bmm(theta.transpose(1, 2));
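
To make the two C++ lines above concrete, here is a minimal Python sketch of the same computation. It assumes align_corners=True and a theta of shape (N, 3, 4); manual_affine_grid_3d is a hypothetical helper written for illustration and is not part of PyTorch. Each output voxel's normalized (x, y, z, 1) coordinate is multiplied by theta, and the result is the location that will be sampled in the input.

```python
import torch
import torch.nn.functional as F


def manual_affine_grid_3d(theta, D, H, W):
    """Illustrative re-implementation of affine_grid for 5D sizes.

    Assumes align_corners=True and theta of shape (N, 3, 4).
    """
    n = theta.shape[0]
    # Normalized coordinates in [-1, 1] along each output axis.
    zs = torch.linspace(-1, 1, D)
    ys = torch.linspace(-1, 1, H)
    xs = torch.linspace(-1, 1, W)
    z, y, x = torch.meshgrid(zs, ys, xs, indexing="ij")
    # Homogeneous base grid, ordered (x, y, z, 1) in the last dimension.
    base = torch.stack([x, y, z, torch.ones_like(x)], dim=-1)
    base = base.reshape(1, D * H * W, 4).expand(n, -1, -1)
    # Same product as in the C++ snippet: base_grid @ theta^T.
    grid = base.bmm(theta.transpose(1, 2))
    return grid.reshape(n, D, H, W, 3)


torch.manual_seed(0)
theta = torch.randn(2, 3, 4)
manual = manual_affine_grid_3d(theta, 3, 4, 5)
reference = F.affine_grid(theta, (2, 1, 3, 4, 5), align_corners=True)
print(torch.allclose(manual, reference, atol=1e-5))
```

The key point for my question is that the base grid is built from the output size, so theta is applied to coordinates of the new grid, not of the original one.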

From my understanding of multiview geometry (which is very limited, so there’s a high chance I’m wrong), this computes where each point of the new grid maps to in the old grid, by applying the transformation theta. But since what I want is to apply a rotation, scaling and translation to the original grid, I would have to pass the inverse of theta to achieve it.
Math explanation

# The final position (x', y', z') after applying a rotation R, scaling S and translation T to a point (x, y, z) can be computed as:
# \theta * (x, y, z) = T * S * R * (x, y, z) = (x', y', z')
# Multiplying both sides by the inverses:
# (x, y, z) = R^-1 * S^-1 * T^-1 * (x', y', z')

This way I correctly apply the transformation to my original grid, and not the other way around (the transformation applied to the new grid).
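
A small experiment that could be used to check this reasoning, as a sketch under stated assumptions: the volume has a single bright voxel, the forward transform is a pure translation of +0.5 along x in normalized coordinates, and align_corners=True with nearest-neighbour sampling keeps everything on exact voxel centers. The helper names to_sampling_theta and bright_index are hypothetical and only exist for this illustration.

```python
import torch
import torch.nn.functional as F


def to_sampling_theta(forward_4x4):
    """Invert a batch of 4x4 forward transforms and keep the top 3 rows,
    giving the (N, 3, 4) shape that affine_grid expects for 5D sizes."""
    return torch.linalg.inv(forward_4x4)[:, :3, :]


def bright_index(volume):
    """Return the (d, h, w) index of the first non-zero voxel."""
    return tuple(int(i) for i in torch.nonzero(volume[0, 0])[0])


size = 5
vol = torch.zeros(1, 1, size, size, size)
vol[0, 0, 2, 2, 2] = 1.0  # single bright voxel at the center

# Forward transform: translate by +0.5 along x (one voxel when size == 5).
forward = torch.eye(4).unsqueeze(0)
forward[0, 0, 3] = 0.5

out_shape = (1, 1, size, size, size)
grid_direct = F.affine_grid(forward[:, :3, :], out_shape, align_corners=True)
grid_inverse = F.affine_grid(to_sampling_theta(forward), out_shape, align_corners=True)

moved_direct = F.grid_sample(vol, grid_direct, mode="nearest", align_corners=True)
moved_inverse = F.grid_sample(vol, grid_inverse, mode="nearest", align_corners=True)

print("theta as-is :", bright_index(moved_direct))
print("theta^-1    :", bright_index(moved_inverse))
```

If the reasoning above holds, the voxel should move toward +x (a larger w index) only when the inverse is passed, and toward -x when theta is passed as-is.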

Final question: Do I use theta or theta^-1 to create the sampling grid?

Please let me know if I failed to explain my doubt clearly and, if I’m wrong, why.
Thanks!