Multiclass accuracy and evaluation


Let me start by saying I’ve searched for this. Apart from a single post (which doesn’t answer the question), what I found makes it clear how to train for multiclass but not how to evaluate.
I’m vectorising text, which is then used as categorical input to a simple MLP with two hidden layers.
Training with CE loss and evaluating for a single class (SoftMax) works fine, but it’s not what I want: each sample can belong to several classes at once, so I’ve ended up using MultiLabelSoftMarginLoss, which trains the network fine.
However, when I try to use the boilerplate evaluation code, it fails because it takes the argmax to get a single class index, which doesn’t make sense when several classes can be active at once.
I’ve copied a sample from another post, and this is how far I’ve gotten:

def multi_eval(model, test_data, test_loader):
    with torch.no_grad():
        correct = 0
        total = 0
        for inputs, labels in test_loader:
            # compute output
            outputs = model(inputs)
            predicted = torch.sigmoid(outputs).data > 0.5

            # PROBLEM is HERE
            ok = torch.isclose(predicted, labels)

    return correct, total

I do not understand how to compare predicted and labels, with or without a small margin of error epsilon, so that I can count the correct and incorrect outputs.
I understand that multi-class evaluation is more complex because some classes may score high and others not. I am looking for an absolute measure: either the whole vector matches the labels or it doesn’t (always allowing a small degree of error), which is why I ended up applying sigmoid and a threshold to get the predicted tensor.
I get errors about ByteTensor and whatnot, and I am sure this is a very simple problem to solve.
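
For reference, here is a minimal sketch of the all-or-nothing comparison described above. It rests on assumptions not stated in the post: the labels are multi-hot float tensors of shape (batch, num_classes), the model returns raw logits of the same shape, and the helper name exact_match_counts and the toy tensors are made up for illustration. The error is probably a dtype mismatch, since torch.isclose is meant for floating-point tensors of the same dtype while predicted is a boolean (byte, in older PyTorch versions) tensor; converting both sides to booleans and using == avoids that.

```python
import torch


def exact_match_counts(logits, labels, threshold=0.5):
    """Return (correct, total), where a sample is correct only if every class matches."""
    # Threshold the per-class probabilities: bool tensor of shape (batch, num_classes)
    predicted = torch.sigmoid(logits) > threshold
    # Assumption: labels are multi-hot floats (0.0 / 1.0); convert them to bool too
    targets = labels > 0.5
    # A row counts as correct only if all of its class decisions agree with the labels
    row_ok = (predicted == targets).all(dim=1)
    return int(row_ok.sum().item()), labels.size(0)


def multi_eval(model, test_loader, threshold=0.5):
    """Accumulate exact-match counts over a loader that yields (inputs, labels) batches."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in test_loader:
            batch_correct, batch_total = exact_match_counts(model(inputs), labels, threshold)
            correct += batch_correct
            total += batch_total
    return correct, total


# Toy data (hypothetical): 3 samples, 3 classes
logits = torch.tensor([[2.0, -1.0, 3.0],
                       [-2.0, 0.5, -0.3],
                       [1.0, 1.0, -4.0]])
labels = torch.tensor([[1.0, 0.0, 1.0],
                       [0.0, 1.0, 1.0],
                       [1.0, 1.0, 0.0]])
correct, total = exact_match_counts(logits, labels)
print(f"exact-match accuracy: {correct}/{total}")
```

If a tolerance is wanted instead of a hard threshold, one option is to compare the float probabilities with the float labels, e.g. torch.isclose(torch.sigmoid(logits), labels, rtol=0.0, atol=eps), and then reduce with .all(dim=1) in the same way; eps here is a value you would have to choose yourself.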