Multi-label models trained on single-label data

I wonder whether it is a good idea to train a multi-label model on a single-label dataset (using e.g. BCELoss). Is such a model likely to generalize well to images in which several of the categories are present? Possibly even more important: would it be likely to respond correctly when no category is present?
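
To make the setup concrete, here is a minimal sketch in plain Python (not PyTorch) of what "multi-label on single-label data" means: each training target is one-hot, the loss is binary cross-entropy applied to every class independently (the same quantity PyTorch's BCEWithLogitsLoss computes from raw logits), and prediction thresholds each sigmoid output separately. The helper names, the four-class example, the logit values, and the 0.5 threshold are all assumptions chosen for illustration, not taken from any particular model.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def one_hot(label, num_classes):
    """Single-label annotation expressed as a multi-label target vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def bce_from_logits(logits, targets):
    """Mean binary cross-entropy, treating every class as its own yes/no problem.

    Uses the numerically stable form max(z, 0) - z*t + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def predict(logits, threshold=0.5):
    """Indices of all classes whose sigmoid probability exceeds the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) > threshold]

# Training: an image annotated with class 2 out of 4 hypothetical classes.
target = one_hot(2, 4)
loss = bce_from_logits([-3.0, -2.0, 4.0, -3.0], target)
print(round(loss, 4))

# Inference: independent thresholds can yield zero or several classes.
print(predict([-3.0, -2.0, -4.0, -3.0]))  # [] : nothing above threshold
print(predict([2.5, -2.0, 3.0, -3.0]))    # [0, 2] : two classes above threshold
```

The sketch only shows that the output format permits "no category" and "several categories", unlike a softmax whose probabilities must sum to one. Whether a network trained only on one-hot targets actually produces such outputs on unseen multi-object or empty images is exactly the open question.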

Has anyone tried this?

Just found this article myself:

It claims that learning from single labels can work even when each single-labeled training image actually contains multiple classes. (My plan was to use training images with only one class present in each.)