```
import torch

r1 = torch.randn(5)
r2 = torch.tensor([-float('inf') for k in range(5)])

# Accumulate the dot product element by element
dp = 0.
for i in range(r1.shape[0]):
    dp += r1[i] * r2[i]
```

```
dp
```

Output

```
tensor(nan)
```

So the sum comes out as nan. Looking closer, it is not repeated addition of -inf that is the problem: each product `r1[i] * -inf` is +inf or -inf depending on the sign of `r1[i]`, and adding +inf to -inf gives nan. This creates a problem while implementing masked attention.
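Isolating the arithmetic (a minimal check, separate from the attention code) shows exactly where the nan comes from:

```
import torch

neg_inf = torch.tensor(float('-inf'))

print(neg_inf + neg_inf)            # tensor(-inf): adding -inf to -inf is fine
print(-1.0 * neg_inf)               # tensor(inf): a negative factor flips the sign
print(neg_inf + (-1.0 * neg_inf))   # tensor(nan): +inf plus -inf is nan
```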

Is there a way in which adding -inf multiple times still gives me -inf? I am not looking for a hack involving a conditional expression. Perhaps I am looking for a solution that makes -inf absorbing with respect to addition, so that `x + -inf` is always -inf.
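For context, the additive-masking pattern I am trying to reproduce (a sketch with made-up scores and a boolean mask) writes -inf into the masked logits before the softmax, so the -inf entries never get multiplied or summed with each other:

```
import torch
import torch.nn.functional as F

scores = torch.randn(1, 4)                          # hypothetical attention logits
mask = torch.tensor([[False, False, True, True]])   # True marks positions to hide

# Set masked logits to -inf; softmax then maps them to weight 0
masked = scores.masked_fill(mask, float('-inf'))
weights = F.softmax(masked, dim=-1)
```

Here each row keeps at least one finite logit, so the softmax is well defined and the masked positions end up with weight exactly 0.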