About Multilabel

(Josiane Rodrigues) #1

Hi everyone,

I’m working with a multilabel dataset for image retrieval, but my precision is very low. I have already tested the loss functions that PyTorch offers for this problem, BCEWithLogitsLoss and MultiLabelSoftMarginLoss, but performance with both is poor. Since I see other people with the same problem, I would like to know whether anyone on the forum has worked with multilabel data, achieved good results, and could provide a tutorial on handling multilabel problems in PyTorch. Or, if such a tutorial already exists, could you link it here?
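For reference, a minimal sketch of how BCEWithLogitsLoss is typically applied to a multilabel batch (shapes here are hypothetical; NUS-WIDE is usually used with 21 concept labels, but your setup may differ):

```python
import torch
import torch.nn as nn

# Hypothetical batch: 4 images, 21 labels, multi-hot target vectors.
logits = torch.randn(4, 21)                      # raw network outputs
targets = torch.randint(0, 2, (4, 21)).float()   # 0/1 per label

# BCEWithLogitsLoss applies the sigmoid internally, so the network's
# last layer must NOT apply one itself -- a common source of bad results.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)
```

If your model already ends in a sigmoid, either remove it or switch to plain `nn.BCELoss`; mixing the two silently degrades training.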
Thank you

(Deeply) #2

It is not easy to answer your question, as many different problems could cause this low precision (I guess you are calculating mAP). First, make sure your code is correct; then check the data loaders: are you loading the data correctly? I usually verify this in debug mode. You may also want to do some exploratory analysis on your images, for example histogram statistics and visual inspection of randomly picked samples, to check that they are correctly labeled.

One thing that would really help is knowing how far your mAP is from chance level. Have other people reported better results than yours on the same dataset? If so, there must be one or more bugs in your code or in your approach (algorithms, etc.).

To get feedback on the multi-label classification itself, you could post how you are doing it, and people on the forum may be able to comment.
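A quick sanity check on the data loader, along the lines suggested above, might look like the following (a toy tensor dataset stands in for real images; all shapes are hypothetical):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for an image dataset: 100 "images", 21 multi-hot labels.
images = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, 2, (100, 21)).float()
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

x, y = next(iter(loader))
assert x.shape == (8, 3, 32, 32)                 # images arrive as expected
assert y.shape == (8, 21)                        # one label vector per image
assert set(y.unique().tolist()) <= {0.0, 1.0}    # labels really are 0/1
print("mean positive labels per image:", y.sum(dim=1).mean().item())
```

Checks like these catch shape mismatches, wrong label encodings, and accidental normalization of label tensors before any training happens.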

(Josiane Rodrigues) #3

Thanks for the feedback.
I’m trying to adapt this implementation https://github.com/flyingpot/pytorch_deephash of a 2015 image-retrieval method to a multilabel dataset (NUS-WIDE). I’m using the same dataset preprocessing as this implementation https://github.com/jiangqy/ADSH-AAAI2018/tree/master/ADSH_pytorch, which reaches approximately 80% mAP; however, it implements a function specific to its own method.

Briefly, a deep-learning image retrieval method takes the representation of the image produced by the network and applies some kind of binarization (in the case of this method, simple rounding up or down to produce codes of 0s and 1s); it then uses these codes to compute similarity between images for retrieval.
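The binarize-then-compare step described above can be sketched like this (a toy example with made-up 12-bit codes; real hashing methods typically use 16–64 bits, and the threshold may differ from zero):

```python
import torch

# Hypothetical continuous hash-layer outputs: one query, five database items.
query_feat = torch.randn(12)
db_feats = torch.randn(5, 12)

# Simple sign/threshold binarization, as the post describes.
query_code = (query_feat > 0).int()
db_codes = (db_feats > 0).int()

# Hamming distance = number of differing bits; smaller means more similar.
dists = (db_codes != query_code).sum(dim=1)
ranking = torch.argsort(dists)   # retrieval order, nearest first
```

One thing worth checking during debugging is whether the continuous features are actually spread around the threshold: if most activations fall on one side, the binary codes collapse and mAP drops sharply even when the loss looks fine.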

I’ve already checked the dataloader and it looks okay, but I’ll check it again more carefully. If you have any suggestions, I’d appreciate them.

(Deeply) #4

I would then check whether the training set is balanced, and try experimenting with the binary encoder. I do not think the loss function is the issue here.
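Checking label balance is straightforward, and if some labels turn out to be rare, one common remedy is the `pos_weight` argument of BCEWithLogitsLoss. A sketch, using a toy label matrix with deliberately rare positives (all numbers here are made up):

```python
import torch
import torch.nn as nn

# Toy multi-hot labels: 1000 samples x 21 labels, ~10% positives.
labels = (torch.rand(1000, 21) < 0.1).float()

# Per-label positive frequency; heavily skewed labels hurt retrieval.
pos_freq = labels.mean(dim=0)

# Weight positives by (num_neg / num_pos) per label so rare labels
# contribute comparably to the loss.
pos_weight = (1 - pos_freq) / pos_freq.clamp(min=1e-6)
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(8, 21)
loss = criterion(logits, labels[:8])
```

Printing `pos_freq` for the real training set is a one-line diagnostic for the imbalance suggested above.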