# How to define an information entropy loss?

(Fangwei123456) #1

I am trying to define an information entropy loss. The input is a tensor of shape (1, n) whose elements all lie in [0, 4]. The EntroyLoss will calculate its information entropy.
For example, if the input is

[0,1,0,2,4,1,2,3]

then

p(0) = 2 / 8 = 0.25
p(1) = 2 / 8 = 0.25
p(2) = 2 / 8 = 0.25
p(3) = 1 / 8 = 0.125
p(4) = 1 / 8 = 0.125

so information entropy loss is

Loss = -( p(0)*log2(p(0)) + p(1)*log2(p(1)) + p(2)*log2(p(2)) + p(3)*log2(p(3)) + p(4)*log2(p(4)) )
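
For this example the loss evaluates to

```
Loss = -( 3 * 0.25 * log2(0.25) + 2 * 0.125 * log2(0.125) )
     = -( 3 * 0.25 * (-2) + 2 * 0.125 * (-3) )
     = 1.5 + 0.75
     = 2.25
```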

My code is here:

```python
import torch
import torch.nn as nn

class EntroyLoss(nn.Module):
    def __init__(self):
        super(EntroyLoss, self).__init__()

    def forward(self, x):
        y = x.view(-1)
        # count how many times each of the values 0..4 occurs
        p = torch.zeros([5])
        for i in range(y.shape[0]):
            p[y[i].int()] = p[y[i].int()] + 1

        # normalize the counts to probabilities
        p = p / y.shape[0]

        # Shannon entropy in bits
        entropy = -p.mul(p.log2()).sum()
        return entropy
```

But PyTorch cannot calculate the grad:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

#2

You can use `torch.histc()` to define an information entropy. (Your version fails because `p` is created by `torch.zeros` and filled in by integer indexing, so it is never connected to `x`'s autograd graph and the loss has no `grad_fn`.) But another error occurs:

RuntimeError: the derivative for ‘histc’ is not implemented

```python
import torch
import torch.nn as nn

class EntroyLoss(nn.Module):
    def __init__(self):
        super(EntroyLoss, self).__init__()

    def forward(self, x):
        y = x.view(-1)
        # histogram over 5 bins covering the values 0..4
        p = torch.histc(y, bins=5, min=0, max=4)
        p = p / y.shape[0]

        # Shannon entropy in bits
        entropy = -p.mul(p.log2()).sum()
        return entropy
```

(Fangwei123456) #3

Thanks! I found it in https://pytorch.org/docs/stable/torch.html#torch.histc. But I got this:
_th_histc is not implemented for type torch.cuda.FloatTensor

#4

Can you define a derivative of the histogram?

#5

Math Problem
Given a vector X = (…, x_i, …), let H(X) denote the histogram of X. Define a derivative of H(X) with respect to the i-th value x_i.

A Solution
X_i(+) and X_i(-) denote the two vectors made by replacing the i-th value of X with x_i + 1 and x_i - 1, respectively. We can define an interpolated histogram

```
I(X, h) = [ (1+h) H(X_i(+)) + (1-h) H(X_i(-)) ] / 2
```

for -1 < h < 1. Differentiating this interpolated histogram with respect to h, we get

```
dI(X, h)/dh = [ H(X_i(+)) - H(X_i(-)) ] / 2
```
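
At the endpoints this interpolation recovers the shifted histograms, I(X, -1) = H(X_i(-)) and I(X, 1) = H(X_i(+)), so the derivative is simply a central difference between them.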

@fangwei123456 If you need this derivative, you can define a `backward()` method as described in *Defining new autograd functions*.
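
For concreteness, here is a minimal sketch of that idea as a custom autograd `Function`, specialized to this thread's setting (integer values in [0, 4], 5 bins). The name `HistcWithGrad`, the clamping at the boundary bins, and the small clamp guarding `log2` against empty bins are my own choices, not anything prescribed by the tutorial:

```python
import torch
import torch.nn as nn

class HistcWithGrad(torch.autograd.Function):
    """torch.histc on the forward pass, with the central-difference
    derivative derived above as a hand-written backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.histc(x, bins=5, min=0, max=4)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        idx = x.long().clamp(0, 4)    # bin of x_i
        up = (idx + 1).clamp(0, 4)    # bin of x_i + 1 (clamped at the edge)
        down = (idx - 1).clamp(0, 4)  # bin of x_i - 1 (clamped at the edge)
        # dH/dx_i = [ H(X_i(+)) - H(X_i(-)) ] / 2, combined with the
        # incoming gradient over the 5 bins via the chain rule
        return (grad_output[up] - grad_output[down]) / 2


class EntroyLoss(nn.Module):
    def forward(self, x):
        y = x.view(-1)
        p = HistcWithGrad.apply(y) / y.shape[0]
        p = p.clamp(min=1e-12)  # avoid 0 * log2(0) = nan for empty bins
        return -p.mul(p.log2()).sum()


x = torch.tensor([0., 1., 0., 2., 4., 1., 2., 3.], requires_grad=True)
loss = EntroyLoss()(x)  # tensor(2.2500, ...)
loss.backward()         # succeeds; x.grad holds the surrogate gradient
```

Because `torch.histc` is called inside `Function.forward`, where autograd does not record operations, it no longer triggers the missing-derivative error; the hand-written `backward()` supplies the surrogate gradient instead.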

(Fangwei123456) #6

Thanks! I will try it and see.