Computing softmax cross entropy for smoothed labels

Hi!
I am trying to compute softmax_cross_entropy_with_logits in PyTorch. I couldn't get the existing APIs to work because of the smoothed labels. In my case the logits and labels have shape [2, 3, 4], and I currently use the following function:

import torch.nn as nn

def softmax_and_cross_entropy(logits, labels):
    return -(labels * nn.LogSoftmax(dim=2)(logits)).sum(dim=2)

I would like to know if there is a better way to go about it, so that the function is written in a more PyTorch-idiomatic style and the backward pass could also be faster.
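
For reference, here is roughly how I call it (a minimal sketch with placeholder tensors of the [2, 3, 4] shape mentioned above; the smoothed labels here are just a stand-in distribution, not my real targets):

import torch

logits = torch.randn(2, 3, 4, requires_grad=True)
labels = torch.softmax(torch.randn(2, 3, 4), dim=2)  # placeholder smoothed targets

loss = softmax_and_cross_entropy(logits, labels)  # per-position loss, shape [2, 3]
loss.mean().backward()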

Hi,

I think you want torch.nn.functional.log_softmax instead of the module. I’m not sure about the output shape you want (maybe an additional mean or sum somewhere?), but other than that it looks very reasonable to me.
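
As a rough sketch of the functional form (keeping your [2, 3, 4] shapes and dim=2):

import torch.nn.functional as F

def softmax_and_cross_entropy(logits, labels):
    # cross entropy per position between soft labels and softmax(logits)
    return -(labels * F.log_softmax(logits, dim=2)).sum(dim=2)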

Best regards

Thomas

Thanks for the reply. nn.LogSoftmax uses torch.nn.functional.log_softmax under the hood, so I guess the two are the same as far as functionality and speed are concerned.

I have the same question. Did you solve this problem?

I have the same problem. Could anyone share a solution here?

Here is a snippet from my implementation, where I basically assign the smoothed label value to the whole tensor and then add the high-confidence labels on top. Note that your smoothing scheme might be different; this function is specific to my use case of computing the loss over a language model:

with torch.no_grad():
    confidence = 1.0 - smoothing
    # probability mass assigned to each non-target class
    low_confidence = smoothing / (vocab_size - 1)
    # fill the whole target tensor with the low-confidence value ...
    soft_targets = logits.new_full(labels.shape + (vocab_size,), low_confidence)
    # ... then place the high-confidence value at the true-label indices
    soft_targets.scatter_(2, labels.unsqueeze(dim=2).long(), confidence)

xentropy = softmax_and_cross_entropy(logits, soft_targets)
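
For completeness, here is a rough end-to-end sketch with made-up shapes (batch 2, sequence length 3, vocab_size 4, smoothing 0.1); the label_smoothing_loss wrapper is just a name I am using for illustration:

import torch

def label_smoothing_loss(logits, labels, vocab_size, smoothing=0.1):
    # wraps the snippet above: build soft targets, then apply softmax_and_cross_entropy
    with torch.no_grad():
        confidence = 1.0 - smoothing
        low_confidence = smoothing / (vocab_size - 1)
        soft_targets = logits.new_full(labels.shape + (vocab_size,), low_confidence)
        soft_targets.scatter_(2, labels.unsqueeze(dim=2).long(), confidence)
    return softmax_and_cross_entropy(logits, soft_targets)

logits = torch.randn(2, 3, 4, requires_grad=True)  # [batch, seq, vocab]
labels = torch.randint(0, 4, (2, 3))               # hard token indices
label_smoothing_loss(logits, labels, vocab_size=4).mean().backward()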