Calculating the Entropy loss

I have a model whose loss should maximize the entropy (not the cross-entropy) of the output, i.e. I'm trying to minimize the negative entropy:
H = -sum(p(x) * log(p(x)))
Let's say:

import torch
import torch.nn as nn

def HLoss(res):
    S = nn.Softmax(dim=1)
    LS = nn.LogSoftmax(dim=1)
    b = S(res) * LS(res)  # p * log(p) for each element
    b = torch.mean(b, 1)
    b = torch.sum(b)
    return b

m = model()
# m is the model output with shape [BatchSize, 3]
g = HLoss(m)

Would this compute gradients that flow back through m into model()?
Is there some way to check whether the gradients are calculated?


I would create a new Module:

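import torch
import torch.nn as nn
import torch.nn.functional as F

class HLoss(nn.Module):
    def __init__(self):
        super(HLoss, self).__init__()

    def forward(self, x):
        # p * log(p) for each element, with p = softmax over the class dimension
        b = F.softmax(x, dim=1) * F.log_softmax(x, dim=1)
        # Negate the sum to get the entropy H = -sum(p * log(p))
        b = -1.0 * b.sum()
        return b

criterion = HLoss()
x = torch.randn(10, 10)
w = torch.randn(10, 3, requires_grad=True)
output = torch.matmul(x, w)
loss = criterion(output)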

I'm not sure why you calculate the mean of b, so I left it out; just add it to the code if you need it.
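
As a quick sanity check, the value of this loss can be compared with the entropy computed by torch.distributions.Categorical. This is only a sketch: the function name entropy_sum and the random inputs are made up for illustration, and the formula is rewritten as a plain function so the snippet is self-contained.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical

def entropy_sum(logits):
    # Same formula as the HLoss module: -sum(p * log(p)) over all rows
    p_log_p = F.softmax(logits, dim=1) * F.log_softmax(logits, dim=1)
    return -p_log_p.sum()

torch.manual_seed(0)
x = torch.randn(10, 10)
w = torch.randn(10, 3, requires_grad=True)
logits = torch.matmul(x, w)

loss = entropy_sum(logits)
# Reference: per-row entropy from torch.distributions, summed over the batch
reference = Categorical(logits=logits).entropy().sum()

loss.backward()
print(loss.item(), reference.item())  # the two values should agree
print(tuple(w.grad.shape))            # gradient has the same shape as w
```

If the two printed values agree and w.grad is populated, the loss is both correct and differentiable with respect to w.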


@ptrblck Oh, sorry the mean was a mistake.
Is there a specific reason why you suggest using a class instead of a function?
The function also populates w.grad, so I was just wondering.
Also, if my goal is to maximize the entropy, which should be preferred:

  1. Changing the line to b = b.sum() (not multiplying it by -1)
    and then minimizing that.
  2. Minimizing -H.

Or is it the same thing, implementation-wise?
Also, if I were to get the output from a model, is there any way to check whether the gradients for the model are being calculated?

I think it’s just a matter of taste and apparently I like the Module class, since it looks “clean” to me. All parameters are defined in the __init__ while the forward method just applies the desired behavior. Using a function would work as well of course, since my Module is stateless.

If you would like to maximize the entropy, you could just remove the multiplication by -1. The loss then equals -H, so minimizing it maximizes H.
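
To illustrate the sign convention, here is a minimal sketch that minimizes the negative entropy of some free logits with plain SGD. The names, learning rate, and step count are arbitrary choices for illustration, not from this thread. As the loss falls, the mean entropy per row should approach its maximum, log(3), which is reached by the uniform distribution over three classes.

```python
import math
import torch
import torch.nn.functional as F

def neg_entropy(logits):
    # sum(p * log(p)) = -H; minimizing this maximizes the entropy H
    return (F.softmax(logits, dim=1) * F.log_softmax(logits, dim=1)).sum()

torch.manual_seed(0)
# Hypothetical stand-in for a model output of shape [BatchSize, 3]
logits = torch.randn(4, 3, requires_grad=True)
optimizer = torch.optim.SGD([logits], lr=1.0)

# Mean entropy per row before training
start = -neg_entropy(logits).item() / logits.shape[0]
for _ in range(500):
    optimizer.zero_grad()
    loss = neg_entropy(logits)
    loss.backward()
    optimizer.step()
# Mean entropy per row after training
end = -neg_entropy(logits).item() / logits.shape[0]

print(f"mean entropy per row: {start:.4f} -> {end:.4f} (max is {math.log(3):.4f})")
```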

Assuming your model has a layer called linear1, you can check the gradients after calling loss.backward() with: model.linear1.weight.grad. It is None until a backward pass has populated it.
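
A small sketch of that check, using a hypothetical one-layer model (the class name TinyModel, the layer sizes, and the random input are made up for illustration). Before backward() each parameter's .grad is None; afterwards it holds a tensor for every parameter that took part in the loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyModel(nn.Module):
    # Hypothetical model with a layer named linear1, as assumed above
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(10, 3)

    def forward(self, x):
        return self.linear1(x)

torch.manual_seed(0)
model = TinyModel()
out = model(torch.randn(5, 10))  # shape [BatchSize, 3]

# A non-None grad_fn means the output is connected to the autograd graph
print(out.requires_grad, out.grad_fn is not None)

grad_before = model.linear1.weight.grad  # None: backward() has not run yet

# Negative entropy of the softmax output, used as the loss
loss = (F.softmax(out, dim=1) * F.log_softmax(out, dim=1)).sum()
loss.backward()

for name, param in model.named_parameters():
    # After backward(), grad is a tensor with the same shape as the parameter
    print(name, param.grad is not None, tuple(param.grad.shape))
```

Looping over model.named_parameters() avoids having to know the layer names in advance.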