Efficient way to repeat elements in a batch using a batch of repeat lengths?

Let's say I have two tensors:
x (len, batch, feature) and duration (len, batch)

And I want to convert it for example like this:

x (1, 2, 3, 4) + dur (3, 3, 2, 1) => (1, 1, 1, 2, 2, 2, 3, 3, 4)

So each element is repeated as many times as the corresponding value in the second tensor.

For now I wrote this code, but it is very inefficient:

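import torch

# Longest expanded sequence in the batch (sum of durations per batch entry)
max_len = int(dur.sum(0).max())

# Zero-padded output of shape (max_len, batch, feature)
out_tensor = torch.zeros((max_len, x.size(1), x.size(2))).cuda()
for idx in range(x.size(1)):
    # Repeat each time step of this batch entry len_ times, then concatenate
    text_expanded = torch.cat([text_.repeat(int(len_), 1)
                               for text_, len_ in zip(x[:, idx], dur[:, idx])])
    out_tensor[:len(text_expanded), idx] = text_expanded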

Try this.

>>> import torch
>>> x = torch.tensor([1, 2, 3, 4])
>>> torch.repeat_interleave(x, torch.tensor([3,3,2,1]))
tensor([1, 1, 1, 2, 2, 2, 3, 3, 4])

https://pytorch.org/docs/stable/torch.html#torch.repeat_interleave
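
For a single sequence that also has a feature dimension, the same call should work with the dim argument. A minimal sketch, assuming a (len, feature) input; the variable names here are only illustrative:

```python
import torch

# 4 time steps, 2 features each: [[0, 1], [2, 3], [4, 5], [6, 7]]
x = torch.arange(8.0).view(4, 2)
dur = torch.tensor([3, 3, 2, 1])

# Repeat whole rows (time steps) along dim 0; dur must be 1-D with len(dur) == x.size(0)
expanded = torch.repeat_interleave(x, dur, dim=0)
print(expanded.shape)  # (sum of dur, feature)
```

Without dim, the input is flattened first, so passing dim=0 is what keeps the feature vectors intact.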


I need to repeat with batches.
For example,
x is the input tensor and dur is a tensor of repeat lengths (0, 1, or 2 here):

import torch
x = torch.rand((50, 16, 128))
dur = (torch.rand((50, 16))*3).long()

Then torch.repeat_interleave(x, dur) will raise RuntimeError: repeats must be 0-dim or 1-dim tensor

I don’t think that is possible for repeats with more than one dimension. In the 2-D case, if dur differs between rows, each row ends up with a different number of elements, which a regular tensor cannot represent.

E.g.

x = [[1,2,3,4],
     [1,2,3,4]]

dur = [[1,1,1,1],
       [2,2,2,2]] # Note the different values in the 2nd row.

# Expected output (which is not supported)
output = [[1,2,3,4],
          [1,1, 2, 2, 3, 3, 4, 4]]

Well, zero padding would be OK.
But I didn't find a way to do it in “batch” style, and iterating twice (over time steps and over batch entries) takes a lot of time.
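
One way to drop the inner loop, at least, might be to call torch.repeat_interleave once per batch entry and let torch.nn.utils.rnn.pad_sequence do the zero padding. This is only a sketch under the shapes stated above; repeat_per_sequence is a made-up helper name, not a PyTorch function:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def repeat_per_sequence(x, dur):
    """Hypothetical helper. x: (len, batch, feature), dur: (len, batch) integer counts."""
    # One repeat_interleave call per batch entry; each piece has shape (sum_of_dur, feature)
    pieces = [torch.repeat_interleave(x[:, b], dur[:, b], dim=0)
              for b in range(x.size(1))]
    # pad_sequence stacks the pieces along a new batch dim and pads with zeros,
    # giving (max_len, batch, feature) with the default batch_first=False
    return pad_sequence(pieces)

# Two batch entries sharing the values 1..4, with different durations
x = torch.tensor([1.0, 2.0, 3.0, 4.0]).view(4, 1, 1).repeat(1, 2, 1)
dur = torch.tensor([[1, 2]] * 4)
out = repeat_per_sequence(x, dur)
```

This still loops over the batch in Python, so it reduces the overhead rather than removing it.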

Sorry, but I am also not aware of a way to do it directly with the built-in operators for tensors with more than one dimension.
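
An indirect route that avoids Python loops may be to build a source index for every output position from the cumulative durations, then index x with it and mask the padding. The following is a sketch, not a built-in operator: batched_repeat_interleave is a hypothetical name, and it assumes x is (len, batch, feature) and dur holds non-negative integers of shape (len, batch).

```python
import torch

def batched_repeat_interleave(x, dur):
    """Hypothetical helper: zero-padded batched repeat, output (max_len, batch, feature)."""
    T, B, _ = x.shape
    ends = dur.cumsum(0)                      # (T, B): end offset of each step in the output
    total = ends[-1]                          # (B,): expanded length per batch entry
    max_len = int(total.max())
    pos = torch.arange(max_len, device=x.device)          # output positions
    # Source step for position p = number of steps that have already ended at p
    src = (pos[:, None, None] >= ends[None]).sum(1)       # (max_len, B)
    valid = pos[:, None] < total[None, :]                 # (max_len, B): False where padding
    src = src.clamp(max=T - 1)                            # keep padded positions in range
    batch_idx = torch.arange(B, device=x.device)[None, :] # (1, B), broadcasts against src
    out = x[src, batch_idx]                               # (max_len, B, feature)
    return out * valid[..., None].to(x.dtype)             # zero out the padding

x = torch.rand((50, 16, 128))
dur = (torch.rand((50, 16)) * 3).long()
out = batched_repeat_interleave(x, dur)
```

The comparison creates a temporary of shape (max_len, len, batch), so this trades memory for speed; for long sequences a per-batch loop may still be the better choice.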

Not sure if this answer is helpful or not, but do check.