2D CrossEntropyLoss for one-hot targets?

From the definition of CrossEntropyLoss:

input has to be a 2D Tensor of size (minibatch, C).
This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch.

My last dense layer outputs shape (mini_batch, 23*N_classes), which I reshape to (mini_batch, 23, N_classes) and softmax along dim=2, so the output and target have the following shape:

I get a predicted output that has shape (minibatch, 23, N_classes), so a given outputted sample is basically the predicted one_hot vector (each of the 23 rows can only be of one class)

What I want to do, basically, is to use CrossEntropyLoss for each of these rows of each sample, i.e. a 2D CrossEntropyLoss.

For a single sample, a given output (after softmax) looks something like this. (Here is an example with smaller dimensions, (n=2, 3, 5), instead of (n, 23, 25).)

tensor([[[0, 1, 0, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 1, 0, 0, 0]],

        [[1, 0, 0, 0, 0],
         [0, 0, 0, 1, 0],
         [0, 0, 0, 1, 0]]])

>>> x = torch.randn(2, 3, 5)
>>> out = F.softmax(x, dim=2)
>>> out
tensor([[[0.2093, 0.1281, 0.5016, 0.0836, 0.0773],
         [0.1146, 0.0575, 0.1194, 0.4064, 0.3021],
         [0.2026, 0.2265, 0.3767, 0.0556, 0.1385]],

        [[0.0473, 0.2789, 0.0782, 0.3650, 0.2306],
         [0.0054, 0.2677, 0.0643, 0.5199, 0.1427],
         [0.4490, 0.2397, 0.2088, 0.0787, 0.0237]]])

Is there a way to use CrossEntropyLoss with a 2D target (so the input would be 3D, (batchsize, dim1, dim2))? I.e.

criterion = nn.CrossEntropyLoss()
loss = criterion(out, target)

Hi Richie!

Yes. CrossEntropyLoss supports what it calls the “K-dimensional case.”

Note, pytorch’s CrossEntropyLoss does not accept a one-hot-encoded
target – you have to use integer class labels instead.

Let’s call your value 23 length. Your input (the prediction generated
by your network) should have shape
[mini_batch, N_classes, length]

Your target (the ground-truth labels) should have shape
[mini_batch, length] without an N_classes dimension and should
be integer class labels in [0, N_classes - 1], inclusive. (In particular,
they are not one-hot encoded.)

Last, do not pass the output of your network through softmax();
CrossEntropyLoss has, in effect, softmax() built in (and expects
its input to be raw-score logits).
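Putting those three points together, here is a minimal sketch (shapes taken from your post; the permute() is there because your network emits [mini_batch, length, N_classes]):

```python
import torch

mini_batch, length, N_classes = 4, 23, 25

# raw logits from the last dense layer, after your reshape:
# [mini_batch, length, N_classes] -- note: no softmax applied
out = torch.randn(mini_batch, length, N_classes)

# CrossEntropyLoss wants the class dimension second:
# [mini_batch, N_classes, length]
out = out.permute(0, 2, 1)

# integer class labels in [0, N_classes - 1], shape [mini_batch, length]
# (not one-hot encoded)
target = torch.randint(N_classes, (mini_batch, length))

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(out, target)  # a single scalar, averaged over batch and length
```

If your ground truth starts out one-hot encoded, target = one_hot.argmax(dim=-1) recovers the integer labels.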


K. Frank


Thanks a lot for the reply!

I hadn’t realized that the ground truth couldn’t be one-hot encoded, or that the output of my network needed the switched shape [batch, N_classes, length] instead of [batch, length, N_classes].

But now, say my vector also has a positional encoding, so it is no longer of dimension [batch, length, N_classes] but [batch, length, N_classes + 4], and each row can take two positive values (one for the class, and one for the positional encoding), e.g.:

label = [0, 1, 0, 0, 0, ...]  # shape [1, 1, 21] for a single item and single row
pos = [0, 0, 1, 0]  # shape [1, 1, 4] for a single item and single row
final_encoding = torch.cat([label, pos], dim=2)  # shape [1, 1, 25] for a single item/row

So now my network predicts both the class AND the positional encoding. If I slice my tensor and then run a separate CrossEntropyLoss on each subtensor, will autograd still work?


Hi Richie!

Just to be clear, if you want to use pytorch’s CrossEntropyLoss, you
have to do it this way.

CrossEntropyLoss is written to take as its target integer class labels
rather than one-hot-encoded labels and doesn’t have a built-in feature
to use one-hot encoding. Similarly, CrossEntropyLoss expects the
second dimension of its input to be the class dimension, so that input
has shape [mini_batch, N_classes, d1, d2, ...], where d1, etc.,
are optional additional dimensions for the “K-dimensional case.”

(That’s just how it works. If you need something else, you would have
to write your own version or write a wrapper for CrossEntropyLoss.)
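As an illustration of such a wrapper (a hypothetical helper, not part of pytorch), you could recover integer labels from a one-hot target with argmax() before calling the built-in loss:

```python
import torch

def cross_entropy_one_hot(logits, one_hot_target):
    # Hypothetical wrapper: accept a one-hot target with the class
    # dimension second ([mini_batch, N_classes, d1, ...]) and convert it
    # to the integer class labels that cross_entropy expects.
    return torch.nn.functional.cross_entropy(
        logits, one_hot_target.argmax(dim=1))

# example shapes from the thread: [batch, classes, length]
logits = torch.randn(4, 25, 23)
one_hot = torch.nn.functional.one_hot(
    torch.randint(25, (4, 23)), num_classes=25)  # [batch, length, classes]
one_hot = one_hot.permute(0, 2, 1)               # class dim second
loss = cross_entropy_one_hot(logits, one_hot)
```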

Yes, doing this makes perfect sense.

I don’t understand your use case or your “positional encoding,” so let
me use an artificial example:

Let’s say you have a set of images and each image contains both a
picture of an animal – cat, dog, mouse – and a digit – 0 through 9. You
want your network to classify both at the same time.

The input to your network (not to the loss criterion) is a batch of images
with shape [mini_batch, height, width], and your ground-truth
labels have shape [mini_batch, 2]. The “animal” labels are
labels[:, 0], a vector of length mini_batch, and are integer class
labels with values 0 (cat), 1 (dog), and 2 (mouse). The digit labels
are labels[:, 1] (also, of course, of length mini_batch) and have
values 0 through 9.

So far, so good.

Your network architecture is whatever it is – perhaps some initial
convolutional layers because you’re classifying two-dimensional
images – followed by some fully-connected layers. But your final
fully-connected layer should have nAnimal = 3 plus nDigit = 10
output features so that the output of your network (the input to your
loss criteria) is a tensor of shape [mini_batch, nAnimal + nDigit].

Now you slice input and labels and apply CrossEntropyLoss twice:

criterion = torch.nn.CrossEntropyLoss()
lossAnimal = criterion(input[:, :nAnimal], labels[:, 0])
lossDigit = criterion(input[:, nAnimal:], labels[:, 1])
lossTotal = lossAnimal + lossDigit

It’s fully legitimate to build your total loss out of multiple losses – in this
case two cross-entropy losses – and it will work just fine with autograd.
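A runnable sketch of that two-headed setup (names nAnimal and nDigit as above; a single Linear layer stands in for the whole network):

```python
import torch

nAnimal, nDigit, mini_batch = 3, 10, 8

# stand-in for the real network; its final layer has nAnimal + nDigit features
net = torch.nn.Linear(32, nAnimal + nDigit)

features = torch.randn(mini_batch, 32)
labels = torch.stack([torch.randint(nAnimal, (mini_batch,)),   # animal labels
                      torch.randint(nDigit, (mini_batch,))],   # digit labels
                     dim=1)                                    # [mini_batch, 2]

output = net(features)                # [mini_batch, nAnimal + nDigit], raw logits
criterion = torch.nn.CrossEntropyLoss()
lossAnimal = criterion(output[:, :nAnimal], labels[:, 0])
lossDigit = criterion(output[:, nAnimal:], labels[:, 1])
lossTotal = lossAnimal + lossDigit

lossTotal.backward()                  # autograd flows through both slices
```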

As you train your network by backpropagating lossTotal, your network,
and, in particular, your final fully-connected layer, will "learn" that the first
nAnimal output features of the final layer are the predicted raw-score
logits for the animal in the image, and that the remaining nDigit output
features are the predicted logits for the digit in the image.


K. Frank


Ah, thank you very much for the example, it is much clearer now. In my head I was stuck on using softmax as the activation of my last layer (instead of leaving the raw logits), and because I absolutely wanted my network to apply the softmax, I couldn’t see how to predict two labels at the same time given the dimensions of the last layer.