I have a model in which the loss is maximizing the entropy (not cross-entropy) of the output, i.e. I’m trying to minimize the negative entropy.

`H = -sum(p(x) * log(p(x)))`
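
For example (a quick numeric sanity check, assuming a 3-class distribution):

```
import torch

p = torch.tensor([0.2, 0.3, 0.5])
H = -(p * p.log()).sum()
print(H)  # ~1.0297 nats; the uniform [1/3, 1/3, 1/3] maximizes H at log(3) ~ 1.0986
```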

Let’s say:

```
import torch
import torch.nn as nn

def HLoss(res):
    S = nn.Softmax(dim=1)
    LS = nn.LogSoftmax(dim=1)
    b = S(res) * LS(res)  # element-wise p * log(p)
    b = torch.mean(b, 1)
    b = torch.sum(b)
    return b

m = model()
# m is a [BatchSize x 3] output.
g = HLoss(m)
g.backward()
```

Would this calculate the gradients for m, i.e. back through model()?

Is there some way to check if the gradients are calculated?


I would create a new `Module`:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class HLoss(nn.Module):
    def __init__(self):
        super(HLoss, self).__init__()

    def forward(self, x):
        # negative sum of p * log(p) -> the entropy H
        b = F.softmax(x, dim=1) * F.log_softmax(x, dim=1)
        b = -1.0 * b.sum()
        return b

criterion = HLoss()
x = torch.randn(10, 10)
w = torch.randn(10, 3, requires_grad=True)
output = torch.matmul(x, w)
loss = criterion(output)
loss.backward()
print(w.grad)
```

I don’t really know why you calculate the `mean` of `b`, so just add it to the code if you need it.


@ptrblck Oh, sorry, the mean was a mistake.

Is there a specific reason why you suggest using a class instead of a function?

The function also produces `w.grad`. Just wondering.
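
For context, the functional version I mean is something like this (a minimal sketch; `h_loss` is just an illustrative name):

```
import torch
import torch.nn.functional as F

def h_loss(x):
    # same computation as the Module above, just as a plain function
    b = F.softmax(x, dim=1) * F.log_softmax(x, dim=1)
    return -1.0 * b.sum()
```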

Also, if my goal is to maximize the entropy, then which should be preferred:

- changing to `b = b.sum()  # not multiplying by -1` and then minimizing that, or
- minimizing `-H` directly?

Or is it the same thing, implementation-wise?

Also, if I get the output from a model, is there any way to check whether the gradients for the model are being calculated or not?

I think it’s just a matter of taste and apparently I like the `Module` class, since it looks “clean” to me. All parameters are defined in the `__init__`, while the `forward` method just applies the desired behavior. Using a function would work as well of course, since my `Module` is stateless.

If you would like to maximize the entropy, you could just remove the multiplication by `-1`.
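
In code, a minimal sketch of that variant (the class name `HLossMax` is just for illustration):

```
import torch.nn as nn
import torch.nn.functional as F

class HLossMax(nn.Module):
    def forward(self, x):
        # sum of p * log(p) equals -H, so minimizing this maximizes the entropy H
        b = F.softmax(x, dim=1) * F.log_softmax(x, dim=1)
        return b.sum()
```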

Assuming your model has a layer called `linear1`, you can check the gradients with `model.linear1.weight.grad`.
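
For example (a minimal sketch with a hypothetical toy model that has a `linear1` layer, using `HLoss` from above):

```
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = nn.Linear(10, 3)

    def forward(self, x):
        return self.linear1(x)

model = Net()
criterion = HLoss()

out = model(torch.randn(4, 10))
loss = criterion(out)

print(model.linear1.weight.grad)  # None before backward()
loss.backward()
print(model.linear1.weight.grad)  # a [3, 10] tensor afterwards
```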
