Hi,

I know there is a KL divergence loss in PyTorch (`KLDivLoss`).

But a limitation of KL divergence is that the target has to be a proper probability distribution, i.e. its values must sum to 1.

However, my target is shaped like a 1-D Gaussian whose values do not sum to 1.

For example:

```
{0,0,0,0.01,0.02,...0.8,0.9,1.0,0.9,0.8........0.02,0.01,0,0,0,0}
```
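Such a target can be sketched as samples of a Gaussian bump whose peak is 1 (the length, center, and `sigma` below are made-up values, just for illustration):

```python
import math

# Hypothetical example: a Gaussian-shaped target whose peak is 1.0.
# Its values do NOT sum to 1, so it is not a probability distribution.
n, center, sigma = 21, 10, 3.0
target = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(n)]

print(max(target))  # peak is 1.0 at the center
print(sum(target))  # sum is well above 1, so KLDivLoss does not apply directly
```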

I tried to model this as multi-label sigmoid cross entropy, for example:

```
-(target * log(pred) + (1 - target) * log(1 - pred))
```

However, I found out this is not a good loss for this problem.

For example, say one of the target probabilities is 0.6. Even when pred approaches 0.6, the loss does not approach 0:

```
-(0.6 * torch.log(torch.tensor(0.6)) + 0.4 * torch.log(torch.tensor(0.4))) = 0.6730
```
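This behavior is expected: with a soft target t, binary cross entropy is minimized at pred = t, but the minimum value equals the entropy of t, which is only 0 when t is exactly 0 or 1. A quick check in plain Python (same numbers as above):

```python
import math

def bce(target, pred):
    # Soft-target binary cross entropy for a single element.
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

t = 0.6
loss_at_optimum = bce(t, t)  # minimum of the loss, reached at pred == target
entropy = -(t * math.log(t) + (1 - t) * math.log(1 - t))

print(loss_at_optimum)            # ~0.6730, not 0
print(bce(t, 0.5) > bce(t, t))    # any other pred gives a larger loss
```

So the loss floor is the target's entropy, not 0; only the relative ordering of predictions is meaningful with this loss.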

I also tried modeling this with an L1 loss, but the results were pretty bad.

Any good advice for modeling this problem?