Multi-Label Classification in PyTorch

That should depend on your label type. If you do multilabel classification (with multiple class indices active per sample), I would recommend calculating an accuracy/F1 score per class. If you do, for example, multilabel segmentation, I would also recommend a per-class evaluation, e.g. evaluating each segmentation map with the Dice coefficient or something similar.

Evaluating each class on its own also has the advantage that it is easier to trace if your model performs badly for only a single class.
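A minimal sketch of such a per-class evaluation using sklearn (y_true and y_pred are hypothetical multi-hot arrays):

import numpy as np
from sklearn.metrics import f1_score

# hypothetical multi-hot targets and predictions: rows = samples, cols = classes
y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0]])

# average=None returns one F1 score per class, so a single weak class stands out
print(f1_score(y_true, y_pred, average=None))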


Hi,

Could anyone give me an example of what a ResNet or another network using BCELoss for multilabel classification would look like? I’m using the MultiLabelSoftMarginLoss function, but the accuracy is getting very low.

Here is a very simple dummy example:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(20, 5)  # predict logits for 5 classes
x = torch.randn(1, 20)
y = torch.tensor([[1., 0., 1., 0., 0.]])  # class0 and class2 are active

criterion = nn.BCEWithLogitsLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-1)

for epoch in range(20):
    optimizer.zero_grad()
    output = model(x)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
    print('Epoch {}, loss: {:.3f}'.format(epoch, loss.item()))

Does anyone know how to do multi-label classification if the number of classes is huge, say up to one million labels?

Thank you.

This is somewhat the case in NLP. You could encode your classes with an embedding and then train a regression network to predict the embedded values.
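A minimal sketch of that idea (all names and sizes here are hypothetical, scaled down from a million labels): embed each label, regress into the embedding space, and recover labels via nearest-neighbor search.

import torch
import torch.nn as nn
import torch.nn.functional as F

num_labels, emb_dim = 10_000, 64               # could scale towards ~1M labels
label_emb = nn.Embedding(num_labels, emb_dim)  # one vector per label
net = nn.Linear(20, emb_dim)                   # regresses into embedding space

x = torch.randn(1, 20)
active = torch.tensor([3, 17])                 # hypothetical active labels
target = label_emb(active).mean(dim=0)         # pool the active-label embeddings

loss = F.mse_loss(net(x).squeeze(0), target)

# at inference time, the nearest label embeddings are the predictions
with torch.no_grad():
    dists = torch.cdist(net(x), label_emb.weight)  # shape (1, num_labels)
    preds = dists.topk(2, largest=False).indices   # two closest labels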

Good idea, but could you offer more details about it, or any resources about your method? Since the task input is an image and the outputs are tags for this image, I have no idea how to use embeddings to do multi-label classification.


Thank you, @ptrblck. I’m testing it here.

Here are some slides on evaluation.

The metrics can be implemented very easily in Python.
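For example, Hamming loss (the fraction of wrongly predicted label positions) is essentially a one-liner in NumPy (hypothetical multi-hot arrays):

import numpy as np

# hypothetical multi-hot targets and predictions
y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 0], [0, 1, 0, 1]])

# Hamming loss: fraction of label positions where prediction and target disagree
print((y_true != y_pred).mean())  # 0.25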

Thanks, @SpandanMadan. Do you have any examples of how to calculate precision for multilabel classification in Python using Hamming distance?


I’m not sure how Hamming distance can be used to measure classification performance.

In multiclass classification, each sample belongs to exactly one class. In multilabel classification, a sample can belong to multiple classes simultaneously. Either way, a sample is either completely present in a class or not, which is what makes classification a discrete problem in the output variable.

How would you incorporate Hamming distance into this? Can you explain your problem in detail to give more context?
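To make the distinction concrete, the targets typically look like this (a hypothetical 4-class example):

import torch

# multiclass: one class index per sample (used with e.g. nn.CrossEntropyLoss)
multiclass_target = torch.tensor([2])

# multilabel: one multi-hot vector per sample (used with e.g. nn.BCEWithLogitsLoss)
multilabel_target = torch.tensor([[0., 1., 1., 0.]])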

Multi-label classification problems can be solved with PyTorch. I have done it successfully (e.g. a 10-species monkey classification).

My problem is image retrieval. I use a deep neural network to generate hash codes for images. In the retrieval phase, I calculate the Hamming distance as the similarity between images.
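For reference, the Hamming distance between such binary codes can be computed like this (the codes here are hypothetical):

import torch

# hypothetical binary hash codes for a query image and a small database
query = torch.tensor([1, 0, 1, 1, 0])
database = torch.tensor([[1, 0, 0, 1, 0],
                         [0, 1, 1, 1, 0]])

# Hamming distance = number of differing bits per database entry
print((database != query).sum(dim=1))  # tensor([1, 2])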

I was wondering how you managed to set up the DataLoader for the multi-label case?

Do you know why now? I have the same problem as you!

Hi,
I want to use multi-label classification, but in my project the order of the classes is crucial. For example, if the model predicts classes 4 and 10, that is not the same as classes 10 and 4. I don’t know how to use BCE, because both classes would just be set to 1 regardless of order. Could you please advise on my issue? What would my loss function be; is there any loss function I can use for this problem?
Thank you so much.

Hamming distance doesn’t seem like a good metric, to be honest. Hamming distance treats every position of the string equally; are you sure you want to do that?

Can you give some more context, please? What are the input and output? Post examples of them.

I have two categories of images to classify, and everything works, but the only problem is: how do I map the labels back to the images’ classes? My network gives me labels like 0 and 1, but how do I know which class 0 refers to and which class 1 refers to? I use datasets.ImageFolder and then DataLoader for getting the batches.
Please help.
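For reference, torchvision’s ImageFolder stores the mapping it used, so you can look up which folder name got which index (the path below is hypothetical):

from torchvision import datasets

# ImageFolder assigns indices alphabetically by subfolder name
dataset = datasets.ImageFolder('path/to/data')
print(dataset.class_to_idx)  # e.g. {'cats': 0, 'dogs': 1}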

In your example there is no batch. How do I add batch training to your example, and moreover, how do I calculate the loss and the Hamming loss or F1 score?

You can just increase the number of samples in the batch dimension if you want to use more than a single sample:

model = nn.Linear(20, 5)  # predict logits for 5 classes
x = torch.randn(2, 20)    # batch of 2 samples
y = torch.tensor([[1., 0., 1., 0., 0.],
                  [1., 0., 1., 0., 0.]])  # class0 and class2 active for both samples

To calculate metrics you could use e.g. sklearn.metrics.
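For example, one could threshold the sigmoid probabilities at 0.5 (the threshold is an assumption and can be tuned per class) and pass the result to sklearn:

from sklearn.metrics import f1_score, hamming_loss

with torch.no_grad():
    probs = torch.sigmoid(model(x))
    preds = (probs > 0.5).int()  # multi-hot predictions

print(f1_score(y.int().numpy(), preds.numpy(), average='macro'))
print(hamming_loss(y.int().numpy(), preds.numpy()))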

Did you figure it out?