Is it possible to train the weights in CrossEntropyLoss?

I am trying to train the weights used in CrossEntropyLoss, and my code looks something like this:

...
loss_weights = nn.Parameter(nn.Softmax()(torch.randn(n_classes, device=device, requires_grad=True)))
criterion = nn.CrossEntropyLoss(weight=loss_weights, size_average=False)
optimizer_ft = optim.SGD(list(mdl.parameters()) + list(loss_weights), lr=0.001, momentum=0.9)
...

But it reports a 'can't optimize a non-leaf Tensor' error.

How can I do this?


The problem with this statement is that a leaf tensor is created (torch.randn(..., requires_grad=True)) and then hidden: nn.Softmax() returns a new, non-leaf tensor, and that is what ends up being passed to the optimizer.

To make this work, try something like:

# Build the initial values, then copy them into a fresh leaf tensor
initial_weights = nn.Softmax(dim=0)(torch.randn(n_classes, device=device))
loss_weights = torch.zeros_like(initial_weights, requires_grad=True)
# Don't record the following operation in autograd
with torch.no_grad():
    loss_weights.copy_(initial_weights)

and then proceed to optimize loss_weights
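For example, reusing the rest of your snippet (note the leaf tensor itself goes into the parameter list as [loss_weights], not list(loss_weights)):

criterion = nn.CrossEntropyLoss(weight=loss_weights, size_average=False)
optimizer_ft = optim.SGD(list(mdl.parameters()) + [loss_weights], lr=0.001, momentum=0.9)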

Hi @richard, thanks for your kind help. I ended up restructuring my code a different way, but then hit another error: 'the derivative for 'weight' is not implemented'. So it seems impossible to optimize the weight argument of CrossEntropyLoss directly; the official docs describe it as a manual rescaling weight. Maybe I need to use torch's log functions to implement the cross-entropy loss myself.
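In case it helps anyone, a manual version along those lines could look like this. This is just a rough sketch (the name weighted_cross_entropy and the softmax re-parametrization of the weights are my own choices, and it assumes outputs are the model's logits and targets are class indices):

import torch
import torch.nn.functional as F

def weighted_cross_entropy(logits, targets, class_weights):
    # log-probabilities over classes, shape (batch, n_classes)
    log_probs = F.log_softmax(logits, dim=1)
    # negative log-likelihood of each sample's target class
    nll = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    # rescale each sample by the (trainable) weight of its class and sum,
    # matching size_average=False
    return (class_weights[targets] * nll).sum()

# keep the raw parameters unconstrained and softmax them so the
# effective weights stay positive and sum to 1
raw_weights = torch.randn(n_classes, device=device, requires_grad=True)
loss_weights = torch.softmax(raw_weights, dim=0)
loss = weighted_cross_entropy(outputs, targets, loss_weights)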

You cannot have them trainable; the weights might go to negative infinity.

Hi Naman,

Thanks. The softmax keeps the weights positive and normalized, so they cannot diverge to infinity.
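For instance, a quick check:

w = torch.softmax(torch.tensor([100.0, -100.0, 0.0]), dim=0)
print(w)        # every entry lies in (0, 1)
print(w.sum())  # and they sum to 1, so no weight can blow up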