Why does F.softmax give identical numbers across the rows of a matrix?

Hi all!

I have a question regarding Softmax.

Suppose I have two tensors with a batch dimension, like this:

import torch
import torch.nn.functional as F

A = torch.Tensor([[1],
                  [2],
                  [3]]).float()[None]  # tensor shape >> (1,3,1)

B = torch.Tensor([[5],
                  [2],
                  [6]]).float()[None] # tensor shape >> (1,3,1)

outer_product = A.view(A.shape[0], A.shape[1], -1) * B.view(B.shape[0], -1, B.shape[1])

#outer_product: 
#tensor([[[ 5.,  2.,  6.],
#         [10.,  4., 12.],
#         [15.,  6., 18.]]])

outpro_SM = F.softmax(outer_product, dim=-1)
# output of outpro_SM is:
#tensor([[[2.6539e-01, 1.3213e-02, 7.2140e-01],
#         [1.1917e-01, 2.9539e-04, 8.8054e-01],
#         [4.7426e-02, 5.8528e-06, 9.5257e-01]]])

Until this point everything seems to work well. However, when I change the operation from an outer product to, for example, an outer addition like this

out_add = A.view(A.shape[0], A.shape[1], -1) + B.view(B.shape[0], -1, B.shape[1])
# output of out_add is:
# tensor([[[6., 3., 7.],
#         [7., 4., 8.],
#         [8., 5., 9.]]])

outadd_SM = F.softmax(out_add, dim=-1)
# output of outadd_SM is 
#tensor([[[0.2654, 0.0132, 0.7214],
#         [0.2654, 0.0132, 0.7214],
#         [0.2654, 0.0132, 0.7214]]])

I am very confused about why the softmax of out_add gives a matrix with the same numbers in every row. Am I doing something wrong here?

Any help is greatly appreciated

Hi Omnia!

This is to be expected.

softmax() converts “unnormalized” log-probabilities to probabilities, and
in the process normalizes them. Because of that normalization, softmax() is
invariant to adding the same constant to all of its inputs, which is why it
can map different inputs to the same outputs. In your case each row of
out_add is just [5., 2., 6.] shifted by the corresponding value of A, so
every row normalizes to the same probabilities. Consider:

>>> import torch
>>> torch.__version__
'2.0.0'
>>> t1 = torch.arange (5.)
>>> t2 = t1 - 1.5
>>> t1
tensor([0., 1., 2., 3., 4.])
>>> t2
tensor([-1.5000, -0.5000,  0.5000,  1.5000,  2.5000])
>>> t1.softmax (dim = 0)
tensor([0.0117, 0.0317, 0.0861, 0.2341, 0.6364])
>>> t2.softmax (dim = 0)
tensor([0.0117, 0.0317, 0.0861, 0.2341, 0.6364])
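
As a minimal check with your own numbers (re-creating out_add here just from
the values you printed): each row of out_add differs from the others only by
an additive constant, so the row-wise softmax comes out identical.

>>> out_add = torch.tensor([[[6., 3., 7.],
...                          [7., 4., 8.],
...                          [8., 5., 9.]]])
>>> out_add - out_add[..., :1]   # subtract each row's first element
tensor([[[ 0., -3.,  1.],
         [ 0., -3.,  1.],
         [ 0., -3.,  1.]]])
>>> out_add.softmax(dim=-1)
tensor([[[0.2654, 0.0132, 0.7214],
         [0.2654, 0.0132, 0.7214],
         [0.2654, 0.0132, 0.7214]]])

Subtracting a per-row constant leaves the softmax unchanged, which is the
same shift invariance shown above with t1 and t2.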

Best.

K. Frank

Really interesting! That makes sense…

Thanks a lot Frank!

Regards,
Omnia