Same probability for every class

I trained a Transformer model for masked word prediction, but the model always outputs the same value for every class:

tensor([[[-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026],
         [-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026],
         [-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026],
         ...,
         [-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026],
         [-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026],
         [-2.3026, -2.3026, -2.3026,  ..., -2.3026, -2.3026, -2.3026]]],
       device='cuda:0', grad_fn=<LogSoftmaxBackward>)

After applying torch.exp():

tensor([[[0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000],
         [0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000],
         [0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000],
         ...,
         [0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000],
         [0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000],
         [0.1000, 0.1000, 0.1000,  ..., 0.1000, 0.1000, 0.1000]]],
       device='cuda:0', grad_fn=<ExpBackward>)
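
For reference, these are the values a log-softmax produces when every logit in a row is identical and there are 10 classes, since log(1/10) ≈ -2.3026. A minimal pure-Python sketch of that arithmetic follows. The helper name `log_softmax` is hypothetical, and the 10-class size is an assumption inferred from the 0.1000 values (the printed tensors are truncated). It only illustrates the numbers and does not identify the cause:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a flat list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

# Assumption: 10 classes with identical logits (any constant value behaves the same).
uniform = log_softmax([3.7] * 10)
print(round(uniform[0], 4))            # log-probability of each class
print(round(math.exp(uniform[0]), 4))  # probability of each class
```

If this matches, the layer feeding the log-softmax is likely emitting the same logit for every class at every position, so that would be the first place to look.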

What could be wrong here? Thanks in advance.
