Improve the indexing speed

Hi, x is a 4-D tensor and y is a 2-D tensor, and I want to replace each value of x with the corresponding value in y, using x itself as the index, like the code I wrote below. Because x.shape[0] is so big, the loop makes my code very slow, so I want to know whether there is a quicker way to do this with some torch function. Thanks

index_array = x.clone().long()      # the values of x serve as column indices into y
for i in range(0, x.shape[0]):
    x[i] = y[i][index_array[i]]     # look up every element of x[i] in row y[i]

Direct indexing with broadcasting should work:

x = torch.randint(0, 10, (10, 2, 3, 4)).float()
y = torch.randn(10, 10)

# reference implementation: the original loop
index_array = x.clone().long()
for i in range(0, x.shape[0]):
    x[i] = y[i][index_array[i]]

# vectorized version: broadcast a batch index against the value index
x2 = y[torch.arange(y.size(0))[:, None, None, None], index_array]
print((x - x2).abs().max())
# > tensor(0.)
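
If the temporary arange index tensor ever shows up as overhead, torch.gather is a possible alternative; this is just a sketch under the same shapes as above, and I wouldn't promise it is faster:

import torch

y = torch.randn(10, 10)
index_array = torch.randint(0, 10, (10, 2, 3, 4))

x2 = y[torch.arange(y.size(0))[:, None, None, None], index_array]
# flatten the trailing dims so the index has the same ndim as y,
# gather along dim 1, then restore the original shape
x3 = y.gather(1, index_array.view(index_array.size(0), -1)).view(index_array.shape)
print((x2 - x3).abs().max())
# > tensor(0.)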

Thanks. It works.

Sorry to disturb you.
I want to ask whether you know of any method to speed up this indexing.
I need to replace a huge number of activations with look-up tables, and this step is the bottleneck of my training.
Or is broadcasting the fastest method built into PyTorch?
I also don't know whether I can use the GPU or other resources to run it in parallel.
Thanks.

If you are already pushing the tensors to the GPU, the GPU would also be used in the indexing operation. I’m not aware of a quick speedup. How much time does this indexing take compared to other operations?
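
One way to check is to time the indexing op in isolation with torch.cuda.Event (a minimal sketch with placeholder shapes; the synchronize calls matter because CUDA kernels run asynchronously):

import torch

y = torch.randn(10000, 10, device='cuda')
index_array = torch.randint(0, 10, (10000, 2, 3, 4), device='cuda')

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

torch.cuda.synchronize()
start.record()
out = y[torch.arange(y.size(0), device='cuda')[:, None, None, None], index_array]
end.record()
torch.cuda.synchronize()
print(f'indexing took {start.elapsed_time(end):.3f} ms')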

I only add this to the activations of every layer.
The computation time for one epoch then increases from 5s to 44s for 10,000 samples.
And the memory usage increases a lot as well.
Thanks

class Mapping_multi(torch.autograd.Function):
    # error_distribution and identity_array are module-level tensors defined elsewhere
    @staticmethod
    def forward(ctx, input, num_partial_sum):
        output = input.clone()
        with torch.no_grad():
            # sample a random error bias per partial sum; the -35 shifts the sampled
            # category indices into the signed value range
            error_bias_array_real = (torch.multinomial(error_distribution, num_partial_sum, replacement=True) - 35).float().permute(1, 0)
            error_array = error_bias_array_real + identity_array[None, :]
            # map activation values to non-negative table indices (+35 offset)
            value_index = (output + 35).long()
            # replace every activation with its table entry via broadcast indexing
            output = error_array[torch.arange(num_partial_sum)[:, None, None, None], value_index]
        return output

    @staticmethod
    def backward(ctx, grad_output):
        # straight-through: pass the gradient unchanged; no gradient for num_partial_sum
        grad_input = grad_output.clone()
        return grad_input, None

Here is my code. If I add this function after the convolution in every layer, it increases the time a lot.
What this function does is replace the activations with error values.
error_array is a 2-D tensor of error values, and I want to replace each activation with the entry of error_array selected by the activation's value.
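
For context, here is a minimal sketch of how such a function could be wired up after a convolution. It assumes the Mapping_multi class above is in scope, and the error_distribution, identity_array, and layer sizes are made-up placeholders, not the original setup:

import torch
import torch.nn as nn

# hypothetical look-up setup: 71 table entries, matching the +/-35 offsets in forward()
error_distribution = torch.full((71, 71), 1.0 / 71)  # placeholder per-row sampling weights
identity_array = torch.arange(-35, 36).float()       # placeholder identity values

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)

out = conv(x).clamp(-35, 35)                  # keep activations inside the table range
out = Mapping_multi.apply(out, out.size(0))   # num_partial_sum must match the batch dim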