Optimizing Performance by Reducing Redundancy in Looping through PyTorch Tensors

Dear PyTorch Community,

I’m currently working on a project where I need to populate a tensor ws_expanded based on certain conditions using a nested loop structure. However, I’ve noticed that rerunning this loop each time incurs a significant computational cost. Here’s the relevant portion of the code for context:

ws_expanded = torch.empty_like(y_rules, device=y_rules.device, dtype=y_rules.dtype)
index = 0

for col, rules in enumerate(rule_paths):
    for rule in rules:
        mask = y_rules[:, col] == rule
        ws_expanded[mask, col] = ws[index][0]
        index += 1

As you can see, the nested loops iterate over rule_paths and rules to populate ws_expanded based on certain conditions. However, as the size of the tensors increases, rerunning this loop becomes prohibitively expensive.

I’m exploring ways to optimize this process. Specifically, I’m wondering if there’s a way to assign the weights (ws) into ws_expanded permanently using pointers in PyTorch, thus eliminating the need to rerun the loop every time.

Could you please advise on the best approach to handle this situation? Any insights or alternative strategies would be greatly appreciated.

Thank you in advance for your assistance.

Hi Samet!

I’m guessing you mean something like the following:

You have some values in ws and you assign them into ws_expanded with
a loop. If you could have done this with “pointers,” then if the values in ws
were to change, the corresponding values in ws_expanded would also
change without rerunning the loop (because the pointers would point to
the new values).

If this is what you want, you can do something conceptually similar by running
your loop once to pre-compute the appropriate indices and then using those
indices to assign the values of ws into ws_expanded.

When the values of ws change, you do have to reassign them into
ws_expanded, but you don’t have to recompute the indices, so you don’t
have to rerun the loop.

Consider:

>>> import torch
>>> print (torch.__version__)
2.2.2
>>>
>>> ws_expanded = torch.zeros (2, 3, 4)
>>> ws = torch.arange (4.).reshape (2, 2) + 1
>>>
>>> # pre-compute indices somehow (perhaps using a loop)
>>> ind0 = torch.tensor ([[1, 1], [1, 1]])   # note that the index-tensors have the same shape as ws
>>> ind1 = torch.tensor ([[0, 0], [2, 2]])
>>> ind2 = torch.tensor ([[0, 1], [2, 3]])
>>>
>>> ws_expanded[ind0, ind1, ind2] = ws       # use indices to assign the original ws values into ws_expanded
>>>
>>> ws
tensor([[1., 2.],
        [3., 4.]])
>>> ws_expanded
tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[1., 2., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 3., 4.]]])
>>>
>>> ws += 100                                # new values for ws
>>>
>>> ws_expanded[ind0, ind1, ind2] = ws       # reuse indices with the new ws values (no loop)
>>>
>>> ws
tensor([[101., 102.],
        [103., 104.]])
>>> ws_expanded
tensor([[[  0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.],
         [  0.,   0.,   0.,   0.]],

        [[101., 102.,   0.,   0.],
         [  0.,   0.,   0.,   0.],
         [  0.,   0., 103., 104.]]])

Best.

K. Frank