Logpt = logpt.gather(1,target) IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Thanks a lot for your answer and clarification. I confirm that using the same args as in CrossEntropyLoss fixed the problem.

I had initially used another FocalLoss implementation, the one from the thread "Is this a correct implementation and use of focal loss for binary classification on vision transformer output? If it is correct, why are all train and val preds still stuck at zero?", which required a Sigmoid. By mistake, I reused the same line of code with the implementation from this post.

#loss = criterion(m(output[:,1]-output[:,0]), labels.float())
loss = criterion(output, labels)
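
For anyone hitting the same error, here is a minimal sketch of why the fix works. It assumes `m` was `nn.Sigmoid()` and that the focal loss takes raw logits of shape `(N, C)` plus integer class labels of shape `(N,)`, like `CrossEntropyLoss`. The `focal_loss` function and its `gamma` default are illustrative only, not the exact implementation from this thread.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    # Hypothetical helper for illustration.
    # logits: (N, C) raw scores; target: (N,) int64 class indices,
    # i.e. the same arguments nn.CrossEntropyLoss expects.
    logpt = F.log_softmax(logits, dim=1)
    # Pick the log-probability of the true class for each sample.
    logpt = logpt.gather(1, target.view(-1, 1)).squeeze(1)
    pt = logpt.exp()
    return (-((1 - pt) ** gamma) * logpt).mean()

torch.manual_seed(0)
output = torch.randn(4, 2)            # stand-in for the model output
labels = torch.tensor([0, 1, 1, 0])   # stand-in for the class labels

loss = focal_loss(output, labels)     # works: input is 2-D

# With gamma=0 the focal term vanishes and the loss equals cross entropy.
ce = F.cross_entropy(output, labels)
matches_ce = torch.allclose(focal_loss(output, labels, gamma=0.0), ce)

# The Sigmoid-style call collapses the input to 1-D, so gather along
# dim 1 has no such dimension to index and is expected to fail.
one_d = torch.sigmoid(output[:, 1] - output[:, 0])
raised = False
try:
    one_d.gather(1, labels.view(-1, 1))
except (IndexError, RuntimeError):
    raised = True
```

The commented-out line passes a 1-D tensor, which suits a Sigmoid/BCE-style loss but not one that gathers along the class dimension.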