Scatter sum or np.histogram with weights


(Stefan Doerr) #1

Hi, what I am trying to do is the following:
I have a data array A (n, m), an index array I of the same size (n, m), and a result array R (x, n).
I am trying to scatter elements of A into R while also summing up all values which scatter to the same index.
In numpy this can be done for 1D arrays, for example, using np.histogram with the weights option.
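For reference, the 1D case mentioned above could look roughly like the sketch below. The array names (values, index, n_bins) are made up for illustration, and np.bincount is shown only as a possible alternative for non-negative integer indices.

```python
import numpy as np

# Hypothetical 1D data: values to scatter and the target index of each value
values = np.array([0.7, 1.3, 56.1, 7.0])
index = np.array([1, 2, 0, 0])
n_bins = 4  # assumed length of the result array

# Bin edges 0, 1, ..., n_bins give one bin per integer index;
# weights makes each bin hold the sum of the values that fall into it.
hist_sum, _ = np.histogram(index, bins=np.arange(n_bins + 1), weights=values)

# np.bincount computes the same weighted sum for non-negative integer indices
bincount_sum = np.bincount(index, weights=values, minlength=n_bins)
```

Both results have length n_bins, and values that share an index are summed.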

Here is an example in numba.cuda, in case it helps explain what I want to do:

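from numba import cuda

@cuda.jit
def rewireValues(R, A, I, totalthreads):
    # One thread per element of the 3D index array I
    threadidx = cuda.threadIdx.x + cuda.blockDim.x * cuda.blockIdx.x
    if threadidx >= totalthreads:
        return
    nj = I.shape[1]
    nk = I.shape[2]

    # Unravel the flat thread index into (frame, source, idx)
    idx = threadidx % nk
    source = (threadidx // nk) % nj
    frame = threadidx // (nj * nk)
    target = I[frame, source, idx]
    if target == -1:  # -1 marks a value that should not be summed anywhere
        return

    # Atomic add, so values scattered to the same target are summed safely
    cuda.atomic.add(R, (frame, target, 0), A[frame, source, idx, 0])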

#2

Could you post some dummy values for A, I and R?


(Stefan Doerr) #3

In this example

A = [[0.7,  1.3], 
     [56.1, 7. ]]
I = [[1, 2], 
     [0, 0]]

Then, if R is an array of zeros with shape (2, 4), it would end up being

R = [[0,    0.7, 1.3, 0], 
     [63.1, 0,   0,   0]] 

But I guess this is a peculiar example, since I assume that each row in I corresponds to a row in A. I imagine there are different ways of storing the index in I.

I see now that A, I and R must have the same length in the first dimension, since it is the number of samples, but they can have a different number of columns.
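As a cross-check of the expected result above, here is a minimal NumPy sketch using np.add.at, which accumulates repeated indices instead of overwriting them. The helper name rows is made up for illustration.

```python
import numpy as np

A = np.array([[0.7, 1.3],
              [56.1, 7.0]])
I = np.array([[1, 2],
              [0, 0]])
R = np.zeros((2, 4))

# Row index for every element of A; shape (2, 1) broadcasts against I
rows = np.arange(A.shape[0])[:, None]

# Unbuffered in-place add: duplicates such as I[1] = [0, 0] are summed
np.add.at(R, (rows, I), A)
```

After this call, R[1, 0] holds 56.1 + 7.0, because both values of the second row point to column 0.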


#4

You could achieve this with tensor.scatter_add_:

import torch

A = torch.tensor([[0.7,  1.3],
                  [56.1, 7. ]])
I = torch.tensor([[1, 2],
                  [0, 0]])

# Along dim 1, add A[i, j] into column I[i, j] of row i; repeated indices are summed
torch.zeros(2, 3).scatter_add_(1, I, A)

(Stefan Doerr) #5

Oh indeed! I don’t know why, but it didn’t work in my earlier tests. Thanks a lot!


(Stefan Doerr) #6

OK, I managed to make it work with 3D arrays now, but I have another question.

How do I define a non-existing index in the index array, for example if some value of A should not be summed anywhere? (In my CUDA code above I set the index to -1 and skipped it.) I guess I could add a garbage column to the result array and dump all the values I don’t want there. But maybe there is a more elegant solution?


#7

I’m not sure, and would suggest doing exactly what you proposed, i.e. using a garbage column.
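A minimal sketch of the garbage-column idea, assuming -1 marks values to skip (as in the CUDA code above). The names n_cols, garbage, idx, idx_clamped and src are made up for illustration. The second variant, which zeroes out the skipped values instead of adding a column, is just one possible alternative and not something suggested in this thread.

```python
import torch

A = torch.tensor([[0.7, 1.3],
                  [56.1, 7.0]])
I = torch.tensor([[1, -1],   # -1: this value should not be summed anywhere
                  [0, 0]])
n_cols = 4  # assumed number of real columns in the result

# Variant 1: redirect skipped entries to an extra last column, then drop it
garbage = n_cols
idx = torch.where(I < 0, torch.full_like(I, garbage), I)
R = torch.zeros(A.shape[0], n_cols + 1).scatter_add_(1, idx, A)[:, :n_cols]

# Variant 2: point skipped entries at column 0 but add zero instead of the value
idx_clamped = I.clamp(min=0)
src = A * (I >= 0)
R_masked = torch.zeros(A.shape[0], n_cols).scatter_add_(1, idx_clamped, src)
```

Both variants leave the skipped 1.3 out of the result. The garbage column costs one extra column of memory, while the masked variant costs one extra elementwise multiply.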