Please see the following thread for an implementation:

You should use LogSoftmax. You have to pass the output of Softmax
through log() anyway to calculate the cross entropy, and the
implementation of LogSoftmax is numerically more stable than the
mathematically (but not numerically) equivalent log(Softmax).
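To illustrate the point, here is a minimal sketch (my own example, not from the thread) comparing the two on logits with a large spread; the naive `log(softmax(...))` underflows to `-inf` while `log_softmax` stays finite:

```python
import torch
import torch.nn.functional as F

# Logits with a large spread: softmax underflows to exactly 0 for the
# small entries, so taking log() afterwards produces -inf.
logits = torch.tensor([[1000.0, 0.0, -1000.0]])

naive = torch.log(F.softmax(logits, dim=1))   # contains -inf entries
stable = F.log_softmax(logits, dim=1)         # all entries finite

print(naive)
print(stable)
```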

If you don’t naturally have soft target labels (probabilities across the
classes), I don’t see any value in ginning up soft labels by adding
noise to your 0, 1 (one-hot) labels. Just use CrossEntropyLoss
with your hard labels.

(If your hard labels are encoded as 0, 1-style one-hot labels you will
have to convert them to integer categorical class labels, as those are
what CrossEntropyLoss requires.)

I want to use label smoothing with criterion = nn.CrossEntropyLoss() and a batch size of 64. The labels are random numbers between 0.8 and 0.9, and the outputs come from a sigmoid. The code is

label = (0.9 - 0.8) * torch.rand(b_size) + 0.8
label = label.to(device).type(torch.LongTensor)
# Forward pass real batch through D
netD = netD.float()
output = netD(real_cpu).view(-1)
# Calculate loss on all-real batch
output1 = torch.zeros(64, 64)
for ii in range(64):
    output1[:, ii] = ii
for ii in range(64):
    output1[ii, :] = output[ii].type(torch.LongTensor)
errD_real = criterion(output1, label)

and the error is:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Applying .type(torch.LongTensor) makes all the labels and outputs become 0, and without it I get an error.

This is probably late to answer. I am also not sure if it would work, but what if you try inserting a manual cross-entropy function inside the forward pass:

soft_loss = -soft_label * log(hard_label)

then apply the hard loss on top of the soft loss:

loss = -sum(hard_label * soft_loss)

…but then you would have to take exp() of the soft loss to counteract applying log twice. I wonder how it would turn out.
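For reference, here is one way such a manual soft-target cross entropy is often written (my own sketch, not the formulation above): take `-sum(q * log p)` per sample with `log_softmax` for stability, and average over the batch. Because nothing is cast to LongTensor, gradients flow normally, avoiding the RuntimeError in the question.

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    # -sum_c q_c * log p_c per sample, averaged over the batch.
    # log_softmax keeps this numerically stable, as noted earlier.
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

logits = torch.randn(8, 5, requires_grad=True)

# Smoothed targets: 0.9 on one class, remaining 0.1 spread over the rest,
# so every row sums to 1.
targets = torch.full((8, 5), 0.1 / 4)
targets[torch.arange(8), torch.randint(0, 5, (8,))] = 0.9

loss = soft_cross_entropy(logits, targets)
loss.backward()   # gradients flow, unlike with the LongTensor cast
print(loss.item())
```

Note that recent PyTorch versions also accept class probabilities directly as the target of nn.CrossEntropyLoss, which may make a manual version unnecessary.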