Inconsistent results with cross_entropy

When running cross_entropy with indices to ignore and no reduction, I get inconsistent results in the ignored positions. For example:

In [547]: scores
Out[547]: 
tensor([[0.6453, 0.0335, 0.8880, 0.7096, 0.6212],
        [0.3379, 0.0559, 0.8633, 0.3935, 0.2379],
        [0.8813, 0.2753, 0.9798, 0.4739, 0.1096],
        [0.3567, 0.8076, 0.1867, 0.1396, 0.0971],
        [0.6270, 0.1037, 0.1204, 0.2856, 0.8055],
        [0.4446, 0.0313, 0.6597, 0.9088, 0.3563],
        [0.6929, 0.4242, 0.9795, 0.7334, 0.0760],
        [0.2506, 0.3346, 0.2624, 0.8448, 0.4569],
        [0.9056, 0.2093, 0.7258, 0.8028, 0.3810],
        [0.4425, 0.4331, 0.1988, 0.1506, 0.1138]])

In [548]: gold
Out[548]: tensor([ 2,  0,  0,  4, -1,  1,  0,  0, -1,  2])

In [549]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[549]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 9.9920e-16, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 0.0000e+00, 1.6885e+00])

In [550]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[550]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 4.2039e-45, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 0.0000e+00, 1.6885e+00])

In [551]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[551]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 2.4612e-28, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 0.0000e+00, 1.6885e+00])

In [552]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[552]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 7.0065e-45, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 2.2695e-21, 1.6885e+00])

In [553]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[553]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 4.2039e-45, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 0.0000e+00, 1.6885e+00])

In [554]: F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
Out[554]: 
tensor([1.3382e+00, 1.6877e+00, 1.3288e+00, 1.8669e+00, 8.1065e+16, 2.1011e+00,
        1.5427e+00, 1.8149e+00, 2.3510e-38, 1.6885e+00])

In the last example, one of the ignored positions had a loss of 8e16!

Is this working as expected, given that these positions are meant to be ignored and the reduced loss appears to be unaffected? Even if it is, I find it very strange not to always get zeros in the ignored positions.
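If the per-position losses need to be clean regardless of what the call leaves in the ignored slots, one option is to overwrite those slots explicitly. This is only a sketch of a workaround: the seeded random `scores` are a stand-in for the tensor above, and the `ignored` mask is a name introduced here for illustration.

```python
import torch
import torch.nn.functional as F

# Stand-in data (assumption): random scores with the same shape as in the post,
# and the same targets, where -1 marks positions to ignore.
torch.manual_seed(0)
scores = torch.rand(10, 5)
gold = torch.tensor([2, 0, 0, 4, -1, 1, 0, 0, -1, 2])

# Per-position loss; ignored positions may hold arbitrary values on affected versions.
loss = F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')

# Force the ignored positions to exactly zero, whatever the call returned there.
ignored = gold.eq(-1)
loss = loss.masked_fill(ignored, 0.0)

print(loss)
```

`masked_fill` replaces the values outright rather than multiplying by a 0/1 mask, so it also cleans up an `inf` or `nan` that a multiplication would leave behind.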

I tried to reproduce this issue, but I get the same values on every call:

tensor([1.338268518448e+00, 1.687772154808e+00, 1.328836917877e+00, 1.866955399513e+00,
        0.000000000000e+00, 2.101119518280e+00, 1.542701721191e+00, 1.814857602119e+00,
        0.000000000000e+00, 1.688515305519e+00])

Note that I set torch.set_printoptions(precision=12, sci_mode=True) to print these outputs at full precision.
Which PyTorch version are you using?

I was using torch 1.0.1. I have now tried 1.1.0, and it consistently gives the same result.
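For anyone who wants to check their own install, a small script along these lines should show whether repeated calls agree. It is a sketch under assumptions: the seeded random `scores` stand in for the original tensor, and the count of 100 repetitions is arbitrary.

```python
import torch
import torch.nn.functional as F

print(torch.__version__)  # the behaviour in this thread depended on the version

# Stand-in data (assumption): same shape and targets as in the original post.
torch.manual_seed(0)
scores = torch.rand(10, 5)
gold = torch.tensor([2, 0, 0, 4, -1, 1, 0, 0, -1, 2])

first = F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')

# Repeat the identical call and count how often the result differs bitwise.
mismatches = 0
for _ in range(100):
    again = F.cross_entropy(scores, gold, ignore_index=-1, reduction='none')
    if not torch.equal(first, again):
        mismatches += 1

print("calls that differed from the first:", mismatches)
print("values at ignored positions:", first[gold == -1])
```

A nonzero mismatch count, or nonzero values at the ignored positions, would point to the behaviour described at the top of the thread.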
