Efficiently Accessing Specific Activations for the Entire Batch

idx is a torch.cuda.LongTensor whose dimensions are n x 3 (where n is the number of indices).

How can I make the following code run faster? It works, but it is very slow.

inp contains all the activations; its first dimension is the number of samples in the current mini-batch.

for row in range(inp.size(0)):
    for x, y, z in idx:
        inp.data[int(row), int(x), int(y), int(z)] = 1


The following code sample should do what you want:

import torch

inp = torch.rand(2, 3, 3, 4)
idx = torch.LongTensor([[0,0,0], [1,1,1], [0, 1, 2]])

b, s1, s2, s3 = inp.size()
# Note that for a contiguous tensor,
# inp.stride(-1) = 1
# inp.stride(-2) = s3
# inp.stride(-3) = s3*s2
lin_idx = idx.select(1, -1)*inp.stride(-1) + \
          idx.select(1, -2)*inp.stride(-2) + \
          idx.select(1, -3)*inp.stride(-3)

tmp_inp = inp.view(b, s1*s2*s3)
tmp_inp.index_fill_(1, lin_idx, 1)
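On recent PyTorch versions, the same fill can also be written with advanced (NumPy-style) indexing, which avoids computing strides by hand. A sketch, assuming the three columns of idx index dims 1, 2, and 3 of inp:

```python
import torch

inp = torch.rand(2, 3, 3, 4)
idx = torch.LongTensor([[0, 0, 0], [1, 1, 1], [0, 1, 2]])

# Index each of the last three dims with one column of idx;
# the leading slice broadcasts the assignment over the whole batch.
inp[:, idx[:, 0], idx[:, 1], idx[:, 2]] = 1
```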


lin_idx = idx.select(1, 2)*inp.stride(-1) + \
          idx.select(1, 1)*inp.stride(-2) + \
          idx.select(1, 0)*inp.stride(-3)
I replaced your lines with this one, and I think it is working properly now. Thanks a lot!
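A quick sanity check that the vectorized fill with positive column indices matches the original double loop (a sketch on random data):

```python
import torch

inp = torch.rand(2, 3, 3, 4)
idx = torch.LongTensor([[0, 0, 0], [1, 1, 1], [0, 1, 2]])

# Reference result: the original slow double loop.
ref = inp.clone()
for row in range(ref.size(0)):
    for x, y, z in idx:
        ref[int(row), int(x), int(y), int(z)] = 1

# Vectorized version: linear index into the flattened last three dims.
lin_idx = idx.select(1, 2) * inp.stride(-1) + \
          idx.select(1, 1) * inp.stride(-2) + \
          idx.select(1, 0) * inp.stride(-3)
out = inp.clone()
out.view(out.size(0), -1).index_fill_(1, lin_idx, 1)

assert torch.equal(ref, out)
```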

Yes, the two lines are the same; I used negative indices to match the stride indexing of the input 🙂


Your method returned an error, though. Which version of PyTorch is supposed to support that syntax?

I am using 0.3.1.post2 and it returns an "out of range" error.

Oh, I'm using the current master branch. Maybe that is the reason.


I think I'll start using the master branch too; I keep bumping into nice additions that are not included in the release yet 🙂

Thanks again!

With the current master, Variable and Tensor have been merged; check this page in the wiki to be sure it's not going to be problematic for you.
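For example, on builds where the merge has landed, a plain tensor can carry requires_grad directly, with no Variable wrapper (a minimal sketch):

```python
import torch

# After the Variable/Tensor merge, autograd works on Tensors directly.
x = torch.ones(3, requires_grad=True)
y = (x * 2).sum()
y.backward()

# x.grad is now a Tensor holding dy/dx, i.e. 2 for each element.
print(x.grad)
```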
