I have a positive tensor of shape (bsz, num_heads, tgt_len, src_len). I need it to sum to 1 along the last dimension (i.e. the src_len dimension). I don’t want to use softmax. Instead, I want to take the sum of the elements along the last dimension and then divide each element by that sum.
How should I do this? Can someone help?
Thanks in advance.
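
For reference, here is a minimal sketch of what I am after, assuming the tensor is a PyTorch tensor. The helper name `normalize_last_dim` and the example sizes are made up for illustration. The idea is to sum over the last dimension with `keepdim=True`, so the sum has shape (bsz, num_heads, tgt_len, 1) and broadcasts against the original tensor in the division.

```python
import torch
import torch.nn.functional as F


def normalize_last_dim(x: torch.Tensor) -> torch.Tensor:
    """Divide each element by the sum over the last (src_len) dimension."""
    # keepdim=True keeps a trailing dimension of size 1 so the division broadcasts.
    denom = x.sum(dim=-1, keepdim=True)
    return x / denom


# Hypothetical sizes, chosen only for this example.
bsz, num_heads, tgt_len, src_len = 2, 4, 5, 7

# Strictly positive input, as in the question.
weights = torch.rand(bsz, num_heads, tgt_len, src_len) + 1e-3

normalized = normalize_last_dim(weights)
row_sums = normalized.sum(dim=-1)  # shape (bsz, num_heads, tgt_len)

# For a positive tensor, L1 normalization along the last dim gives the same result.
alternative = F.normalize(weights, p=1, dim=-1)
```

Since the tensor is positive, the sum is never zero, so the plain division should be safe. If zeros could appear in a whole row, something like `denom.clamp_min(1e-12)` might be needed to avoid dividing by zero.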