import torch

r1 = torch.randn(5)
r2 = torch.tensor([-float('inf') for k in range(5)])

dp = 0.
for i in range(r1.shape[0]):  # r1.shape is a torch.Size; index it to get the length
    dp += r1[i] * r2[i]

print(dp)  # usually tensor(nan)
So it seems that summing terms involving -inf gives a nan: each product r1[i] * r2[i] is +inf or -inf depending on the sign of r1[i], and under IEEE 754 arithmetic inf + (-inf) is nan. This creates a problem while implementing masked attention.
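The special-value arithmetic involved can be checked directly in plain Python, independent of torch (a minimal sketch using the math module):

```python
import math

# Adding -inf to itself is well defined and stays -inf.
print(-math.inf + -math.inf)   # -inf

# The nan appears when opposite infinities meet, or when 0 multiplies inf.
print(math.inf + -math.inf)    # nan
print(0.0 * -math.inf)         # nan
```

So repeated addition of -inf alone is safe; it is the mix of +inf and -inf terms in the dot product above that produces the nan.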
Is there a way in which adding -inf multiple times still gives me -inf? I am not looking for a hack involving a conditional expression. Rather, I am looking for a solution that makes -inf absorbing with respect to addition, so that any sum involving a -inf term stays -inf.
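For context, a common pattern in masked-attention implementations sidesteps this arithmetic entirely: the -inf mask is added to the attention logits just before the softmax rather than folded into a dot product, since softmax maps a -inf logit to exactly zero. A minimal sketch (the shapes and the mask layout here are illustrative assumptions, not from the original post):

```python
import torch

# Toy attention logits for 4 query positions over 4 key positions.
scores = torch.randn(1, 4, 4)

# Additive mask: 0 where attention is allowed, -inf where it is blocked.
mask = torch.tensor([[0., 0., -float('inf'), -float('inf')]])

# softmax(x + mask) sends exp(-inf) to 0, so masked positions get weight 0
# and no inf * value products ever occur.
weights = torch.softmax(scores + mask, dim=-1)
print(weights)
```

Each -inf appears exactly once per logit here, so the +inf/-inf cancellation that produces the nan never arises.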