Parallel indexing inside voxels


How can I (if I can) vectorize those loops ?

(I’m trying to compute the eigenvalues of each pixel of a volume, I saw that torch.eigenvalsh can do that but it doesn’t work on GPU if my volume is too big so I’m trying to make my own implementation, if you guys have any clue on what can I do on that topic I would gladly take any help :sweat_smile:)

    A = torch.randn(200, 200, 200, 3, 3) # Hessian matrices of each voxel of input volume
    R = torch.zeros(200, 200, 200, 3, 3) # Temp matrices for each voxel of input volume
    ij = torch.randint(0, 2, size=(200, 200, 200, 2)) # Some indexes for each voxel

    def f(value: torch.float) -> torch.float:
        return value

    img_shape = A.shape[:-2]
    # For each voxel
    for x in range(img_shape[0]):
        for y in range(img_shape[1]):
            for z in range(img_shape[2]): 
                # Do some math
                i = ij[x, y, z, 0]
                j = ij[x, y, z, 1]
                R[x, y, z, i, j] = f(A[x, y, z, i, j])

I want to be able to do this operation on GPU (as I understand I cannot really use for loops if I want to do that).