Loss function for Multi-Label Multi-Class Classification

I have a multi-label multi-class classification problem and I am wondering which loss function I should use.

My labels encode the positions and types of objects. There are 64 positions, and each position can take the value 0, 1, or 2.

An example label: [0., 0., 0., 2., 0., 0., 0., 0., 2., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 2., 0., 0., 0., 2., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 2., 0., 0., 0., 0.]

0 means there is no object at this position,
1 means there is an object of type 1 at this position,
and 2 means there is an object of type 2 at this position.

I have checked BCEWithLogitsLoss and this post, and I am not sure whether it applies to my problem. I tested BCEWithLogitsLoss on one example and the result did not make sense: when output == target the loss is greater than zero (0.68), and when output != target the loss is only slightly larger (0.70). In the example I replaced the 2s with 0.5, since the targets t[i] must be numbers between 0 and 1.
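
A minimal sketch of the kind of check I ran (the tensor values here are illustrative, not my real data):

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Illustrative target with the 2s already replaced by 0.5
target = torch.tensor([[0., 0.5, 1., 0.]])

# output == target: the loss is still > 0, because BCEWithLogitsLoss
# passes the raw outputs through a sigmoid before computing BCE
print(criterion(target.clone(), target).item())

# output != target: the loss is larger
output = torch.tensor([[1., 0., 0., 1.]])
print(criterion(output, target).item())
```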

You can simply use CrossEntropyLoss for this. CrossEntropyLoss now supports inputs with extra dimensions: your final output should have shape (N, C, X), where N is the batch size, C is the number of classes (3 in your case), and X is the number of positions (64 in your case). The target should then have shape (N, X) and contain the class indices 0, 1, or 2 directly, with no need to rescale them into [0, 1]. During inference you can take the argmax along dim=1 to get the predicted class index at each position.
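
For example, a minimal sketch (the random logits stand in for whatever your actual network produces):

```python
import torch
import torch.nn as nn

N, C, X = 8, 3, 64  # batch size, number of classes, number of positions

criterion = nn.CrossEntropyLoss()

# Suppose the network's final layer produces raw logits of shape (N, C, X)
logits = torch.randn(N, C, X, requires_grad=True)

# Targets are class indices 0/1/2 for each position, shape (N, X), dtype long
target = torch.randint(0, C, (N, X))

loss = criterion(logits, target)  # scalar loss averaged over all positions
loss.backward()

# Inference: argmax over the class dimension (dim=1) gives the
# predicted class (0, 1, or 2) at each of the 64 positions
pred = logits.argmax(dim=1)  # shape (N, X)
```

Note that the loss is computed per position and averaged, so each of the 64 positions is treated as an independent 3-way classification.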