I have a dataset consisting of one input X (textual data) and 5 different targets y (topics of each text), where each y can take values from 0 to 5. I need to develop a deep learning model that takes the text and predicts all five y values. I would really appreciate it if you could guide me or point me to some resources for dealing with this kind of problem in PyTorch. Also, am I correct that this problem is considered multi-output classification? Thanks
Hello, I believe you can use an LSTM or a transformer (you can take a pre-trained model for 2-class classification from Hugging Face) and add an extra layer for 5-class classification. Hope this helps.
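A minimal sketch of that idea in PyTorch: a shared encoder (here an LSTM, but a transformer works the same way) with five separate classification heads, one per target, each emitting 6 logits for the values 0 to 5. All sizes and names below are illustrative assumptions, not from the thread:

```python
import torch
import torch.nn as nn

class MultiOutputClassifier(nn.Module):
    """Shared text encoder with 5 heads, one per target y (hypothetical sizes)."""

    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128,
                 num_targets=5, num_classes=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # one linear head per target; each predicts one of 6 values (0-5)
        self.heads = nn.ModuleList(
            nn.Linear(hidden_dim, num_classes) for _ in range(num_targets)
        )

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)
        _, (h_n, _) = self.encoder(embedded)   # h_n: (1, batch, hidden_dim)
        features = h_n.squeeze(0)
        return [head(features) for head in self.heads]  # list of 5 logit tensors

model = MultiOutputClassifier()
tokens = torch.randint(0, 1000, (4, 20))       # fake batch of 4 tokenized texts
logits = model(tokens)
print(len(logits), logits[0].shape)            # 5 torch.Size([4, 6])
```

At training time you would compute `nn.CrossEntropyLoss` per head against its own target column and sum (or average) the five losses before backpropagating.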
There are two terms to keep apart here: multi-class classification and multi-label classification. If each document in your dataset belongs to only one topic and the number of topics is more than two, you are talking about multi-class classification. If each document in your dataset may belong to one or more classes at the same time, your problem is multi-label classification.
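The distinction matters mainly for the target encoding and the loss function. A quick PyTorch sketch (the shapes and target values are illustrative assumptions):

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 6)  # batch of 4 documents, 6 topics

# Multi-class: each document belongs to exactly ONE of the 6 topics,
# so the target is a single class index per document.
multiclass_targets = torch.tensor([0, 5, 2, 3])
ce = nn.CrossEntropyLoss()(logits, multiclass_targets)

# Multi-label: each document may belong to SEVERAL topics at once,
# so the target is a 0/1 vector per document, one entry per topic.
multilabel_targets = torch.tensor([[1., 0., 0., 0., 0., 1.],
                                   [0., 1., 1., 0., 0., 0.],
                                   [0., 0., 0., 0., 0., 1.],
                                   [1., 1., 0., 0., 1., 0.]])
bce = nn.BCEWithLogitsLoss()(logits, multilabel_targets)

print(ce.item(), bce.item())  # two scalar losses
```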
I assume that your problem is just multi-class text classification. You can use AutoModelForSequenceClassification from the Hugging Face library. It wraps both PyTorch and TensorFlow, so you can produce PyTorch models via Hugging Face.
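A short sketch of that route. To keep the example self-contained I build the model from a tiny randomly initialized config instead of downloading pretrained weights; in practice you would call `AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=6)` and tokenize with the matching tokenizer. The config sizes below are made up for illustration:

```python
import torch
from transformers import AutoModelForSequenceClassification, BertConfig

# Hypothetical tiny config; in real use, load a pretrained checkpoint instead.
config = BertConfig(vocab_size=100, hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128,
                    num_labels=6)  # 6 classes: topic values 0-5
model = AutoModelForSequenceClassification.from_config(config)

input_ids = torch.randint(0, 100, (4, 16))  # fake batch of tokenized texts
outputs = model(input_ids=input_ids)
print(outputs.logits.shape)                 # torch.Size([4, 6])
```

Note this gives you one 6-way classifier; for five separate targets you would train five such models or attach five heads as sketched earlier in the thread.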
As another solution, you can directly employ PyTorch seq2seq models. You can probably use FastText easily. Here is a related topic that I created before. For sentiment classification with just two topics, I used FastText as a deep learning approach.