Doing operations on different parts of a tensor in parallel

antspy · September 15, 2017, 4:06pm

Hi,

assume I have a tensor, and I want to apply a certain function to buckets of its elements. For example, assume I have the tensor

>>> A 
 0
 2
 0
 2
[torch.FloatTensor of size 4x1]

and I want to compute the mean for every “bucket” of two elements, and replace it, like so:

>>> for idx in range(0, 4, 2):
    A[idx:idx+2] = torch.mean(A[idx:idx+2] )

>>> A 
 1
 1
 1
 1
[torch.FloatTensor of size 4x1]

The issue is that for loop may be very slow, since it has to execute A.numel()/2 times. Is there any way to make it parallelizable, so that it runs in parallel over multiple buckets at the same time?

Note: The example with the mean is just to clarify what I meant, the actual function is slower and more complicated.

Thank you!

smth · September 15, 2017, 6:50pm

you can use torch.unfold to compute your bucket tensor, and then you can compute the mean for each bucket.

http://pytorch.org/docs/master/tensors.html?highlight=unfold#torch.Tensor.unfold

antspy · September 16, 2017, 1:06pm

Thank you, that worked well and sped it up more than 10x! Cheers

antspy · September 17, 2017, 9:29am

Actually, in my case the buckets don’t overlap, so I think it would be better to just use

tensor.view(-1, bucket_size)

which avoids creating a new tensor like unfold would.

mu2 · October 30, 2020, 11:55am

@smth @antspy I’m sorry I don’t understand the answer. I could bucket the tensor using unfold or view, but how do I apply the function (mean in this case) in parallel?

ptrblck · October 31, 2020, 2:35am

After creating the “buckets” or windows using unfold, you could flatten the elements in this bucket and apply e.g. torch.mean on this dimension.