I’m looking to compute the discrete entropy (not cross-entropy) of a multidimensional tensor in my loss function. Consider the tensor to have shape `BATCH_S x D`, where each data point is a `D`-dimensional vector, and I have `BATCH_S` of them. The formula requires me to compute the probability of seeing each data point, and I can’t seem to find a way to do this while retaining the ability to do a backward pass. To do so, all I need is the frequency of each data point, because I can divide that by the sum of the frequencies.
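
For concreteness, this is the quantity I mean, written as the usual plug-in entropy estimate. The notation is my own assumption: each distinct row $x$ of the batch is treated as one symbol, and $n(x)$ is the number of times that row occurs.

$$
\hat{p}(x) = \frac{n(x)}{\sum_{x'} n(x')} = \frac{n(x)}{\texttt{BATCH\_S}}, \qquad
H = -\sum_{x} \hat{p}(x) \, \log \hat{p}(x)
$$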

There’s `torch.histc`, but that doesn’t work for multidimensional data, and neither does `torch.bincount`. There’s also `torch.unique`, which has the optional `return_counts` parameter, but the counts it returns have no `grad_fn`.
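
Here is a minimal sketch of what I mean with `torch.unique`. The tiny tensor `x` and its values are made up for illustration, and I'm assuming each row counts as one data point (hence `dim=0`). The entropy value comes out fine, but the counts are an integer tensor, so nothing downstream of them is connected to the graph.

```python
import torch

# Hypothetical toy batch: BATCH_S = 4 data points, D = 2 dimensions each.
x = torch.tensor(
    [[0.0, 1.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [0.0, 0.0]],
    requires_grad=True,
)

# Count how often each distinct row occurs (dim=0 treats rows as items).
uniq, counts = torch.unique(x, dim=0, return_counts=True)

# Empirical probabilities and the plug-in entropy (natural log).
probs = counts.float() / counts.sum()
entropy = -(probs * probs.log()).sum()

print(counts.grad_fn)         # None: counts are integers, not differentiable
print(entropy.requires_grad)  # False: backward() cannot reach x from here
```

So the forward value is what I want, but calling `entropy.backward()` is not possible because `entropy` does not depend on `x` as far as autograd is concerned.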

Any help is appreciated! Thanks