Cumulative sum with decay factor

Hello,

I need to perform the following operation:

y[i] = w * \sum_{j=1}^{i} x[j] + x[i]

This is equivalent to torch.cumsum() when w = 1. However, in cases where we need to decay the summation, i.e. 0 < w < 1, cumsum() won’t work. Any suggestions on how to go about solving this problem?

Thanks,
Ahmed

Hello Ahmed!

As written, this doesn’t actually have a decay factor. (Each value of
y only has at most a single power of w.)

I think you mean to write something like

y[i] = w * y[i - 1] + x[i]

You can get what (I think) you want by pre-multiplying x with powers
of the weight, and then using cumsum(), all using pytorch tensor
operations, and no explicit loops.

Here is a (pytorch version 0.3.0) demonstration script:

import torch
torch.__version__

# weighted cumulative sum with tensor operations
def wtCumSum (vector, wt):
    wts = wt**((len (vector) - 1.0) - torch.FloatTensor (range (len (vector))))
    return torch.cumsum (wts * vector, dim = 0) / wts

torch.manual_seed (2020)

vector = torch.randn (10,)
wt = 0.5

wcs = wtCumSum (vector, wt)

vector
wt
wcs

# check against non-tensor iterative calculation
wcs2 = []
wcs2.append (vector[0])
for  i in range (len(vector) - 1):
    wcs2.append (wt * wcs2[i] + vector[i+1])

wcs2

wcs - torch.FloatTensor (wcs2)

And here is the output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> # weighted cumulative sum with tensor operations
... def wtCumSum (vector, wt):
...     wts = wt**((len (vector) - 1.0) - torch.FloatTensor (range (len (vector))))
...     return torch.cumsum (wts * vector, dim = 0) / wts
...
>>> torch.manual_seed (2020)
<torch._C.Generator object at 0x000001BEE6816630>
>>>
>>> vector = torch.randn (10,)
>>> wt = 0.5
>>>
>>> wcs = wtCumSum (vector, wt)
>>>
>>> vector

 1.2372
-0.9604
 1.5415
-0.4079
 0.8806
 0.0529
 0.0751
 0.4777
-0.6759
-2.1489
[torch.FloatTensor of size 10]

>>> wt
0.5
>>> wcs

 1.2372
-0.3418
 1.3706
 0.2775
 1.0193
 0.5626
 0.3564
 0.6559
-0.3480
-2.3229
[torch.FloatTensor of size 10]

>>>
>>> # check against non-tensor iterative calculation
... wcs2 = []
>>> wcs2.append (vector[0])
>>> for  i in range (len(vector) - 1):
...     wcs2.append (wt * wcs2[i] + vector[i+1])
...
>>> wcs2
[1.2372283935546875, -0.34178459644317627, 1.370636761188507, 0.2774530053138733, 1.0193056166172028, 0.5625850260257721, 0.3564097583293915, 0.6558640152215958, -0.34795021265745163, -2.3229051791131496]
>>>
>>> wcs - torch.FloatTensor (wcs2)

 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
[torch.FloatTensor of size 10]

Please note that this illustrates the idea, but is written only for a
one-dimensional vector.

Also note that if wt differs much from one and vector is rather
long, then the powers of wt will underflow to zero, and you will
get 0 / 0 nans in the result of this version.

So for real work, you should write out the loop with the conventional
iterative formula, losing autograd and the performance benefit of
tensor operations (or implement a tensor version of weighted
cumsum()).
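
For reference, a minimal sketch of the explicit-loop fallback might look like
this (my own naming, written against a more recent pytorch, one-dimensional
input only):

import torch

# weighted cumulative sum with an explicit python loop:
# y[i] = wt * y[i - 1] + x[i]
def wtCumSumLoop (vector, wt):
    out = torch.empty_like (vector)
    running = torch.zeros ((), dtype = vector.dtype, device = vector.device)
    for i in range (len (vector)):
        running = wt * running + vector[i]
        out[i] = running
    return out

This avoids the underflowing powers of wt entirely, at the cost of a
python-level loop over the elements.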

Best.

K. Frank


Thanks for your response.

1. Factoring out decay factor:

So yes, in principle the idea is correct: rescale so that the recurrence becomes an ordinary cumulative sum,

y[i] / w^i = y[i-1] / w^(i-1) + x[i] / w^i

and then multiply back by w^i (see the sketch after this list). But the array is rather long, so this can lose precision.

2. For loops:
The second way, using for loops, is also viable, but I have to see whether there is much of a performance drop (I do not need backpropagation anyway).

3. Implement tensor version:
I have never done this before, but I would like to give it a try. Can you recommend how to get started?
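
For concreteness, here is what I mean in point 1, in double precision (essentially the wtCumSum above, just in float64 to push the underflow further out; the naming is mine and this is only a sketch):

import torch

def weighted_cumsum(vector, wt):
    # rescale x[j], run an ordinary cumsum, then undo the rescaling
    n = len(vector)
    wts = wt**((n - 1.0) - torch.arange(n, dtype=torch.float64))
    return torch.cumsum(wts * vector.double(), dim=0) / wts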

Thanks!

Hello Ahmed!

Since you don’t need your weighted cumulative sum for
backpropagation, I assume that you are using it for neither
training nor inference.

As such, any loss of performance shouldn’t matter much to
your overall workflow, so I wouldn’t worry about it.

Good luck!

K. Frank

Hello,

Half of your assumption is wrong :stuck_out_tongue: I use it during inference.

Thanks,
Ahmed

Check out a package I made for this purpose: GitHub - toshas/torch-discounted-cumsum: Fast Discounted Cumulative Sums in PyTorch

Just set up a neat little function based on the above idea, which also reverses the direction of the cumsum, as is typical in most RL applications:

import torch

def discount_cumsum(vector, discount, device='cpu'):
    wts = discount**torch.arange(len(vector), dtype=torch.float64, device=device)  # pre-compute all discounts
    x = wts * vector  # weighted vector
    cum_sum = torch.cumsum(x, dim=0)  # forward cumsum
    re_cum_sum = x - cum_sum + cum_sum[-1:]  # reversed cumsum
    return re_cum_sum / wts

As mentioned above, beware of precision issues when vectors get long and decay factors go to zero. Also note that this method for the reversed cumsum is faster than using torch.flip().
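
A quick sanity check against a plain right-to-left loop (my own example values; just a sketch):

import torch

x = torch.randn(8, dtype=torch.float64)
gamma = 0.9

fast = discount_cumsum(x, gamma)

# reference: y[i] = sum_{j >= i} gamma**(j - i) * x[j], accumulated right to left
ref = torch.zeros_like(x)
running = 0.0
for i in reversed(range(len(x))):
    running = x[i] + gamma * running
    ref[i] = running

print(torch.allclose(fast, ref))  # should print True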
