Multi-label and Multi-class text classification with Bert

Hi everyone,

I’m using the run_ner script from Hugging Face transformers to perform a PoS tagging task on a CoNLL-U dataset.

Now I would like to do two tasks together: predict both the PoS tag and the head of each word, again in the CoNLL-U format.

To do this, my idea is to take the pre-trained BERT model as it is loaded by the run_ner script, remove its last layer, and add two dense layers that perform the two classification tasks at the same time.

The model is as follows:

from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer

config = AutoConfig.from_pretrained(
    model_args.config_name if model_args.config_name else model_args.model_name_or_path,
    label2id={label: i for i, label in enumerate(labels)},
)
tokenizer = AutoTokenizer.from_pretrained(
    model_args.tokenizer_name if model_args.tokenizer_name else model_args.model_name_or_path,
)
model = AutoModelForTokenClassification.from_pretrained(
    model_args.model_name_or_path,
    from_tf=bool(".ckpt" in model_args.model_name_or_path),
    config=config,
)

To remove the last layer I did:

model = nn.Sequential(*list(model.children())[:-1])

but now I have two problems:

  1. I don’t know how to add the two nn.Linear() layers to the model
  2. I don’t know how to pass the labels for both the PoS tag and the head to the model

Thanks in advance!
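
For the first point, one possible approach is to wrap the encoder in a custom nn.Module that owns two nn.Linear heads, instead of slicing the model with nn.Sequential. The sketch below is only an illustration under several assumptions: ToyEncoder is a hypothetical stand-in for BERT so that the snippet runs without downloading a checkpoint, the names TwoHeadTagger, pos_classifier and head_classifier are made up, and the label counts (17 PoS tags, 10 head positions) are placeholders. With a real Hugging Face encoder you would take the hidden states from the `last_hidden_state` field of its output instead of using the return value directly.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for the BERT encoder: token ids -> hidden states."""

    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, input_ids):
        return torch.tanh(self.proj(self.embed(input_ids)))


class TwoHeadTagger(nn.Module):
    """Shared encoder with one token-level linear classifier per task."""

    def __init__(self, encoder, hidden_size, num_pos_labels, num_head_labels):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(0.1)
        self.pos_classifier = nn.Linear(hidden_size, num_pos_labels)
        self.head_classifier = nn.Linear(hidden_size, num_head_labels)

    def forward(self, input_ids):
        # hidden has shape (batch, seq_len, hidden_size)
        hidden = self.dropout(self.encoder(input_ids))
        # Both heads read the same hidden states and return raw logits
        return self.pos_classifier(hidden), self.head_classifier(hidden)


# Placeholder sizes, chosen only for the demonstration
encoder = ToyEncoder(vocab_size=100, hidden_size=32)
model = TwoHeadTagger(encoder, hidden_size=32, num_pos_labels=17, num_head_labels=10)

input_ids = torch.randint(0, 100, (2, 6))  # batch of 2 sentences, 6 tokens each
pos_logits, head_logits = model(input_ids)
print(pos_logits.shape, head_logits.shape)
```

Each head produces one score per label for every token, so the labels of each task can be supplied as a separate tensor of shape (batch, seq_len).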

I added a new module to the model in the following way:

model = nn.Sequential(*list(model.children())[:-1], NewModule(num_labels))

where NewModule() is:

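import torch
from torch import nn

class NewModule(nn.Module):
    def __init__(self, num_labels):
        super().__init__()
        # One linear classifier per task on top of the 768-dim hidden states
        self.classifierDeprel = nn.Linear(768, num_labels)
        self.classifierRelPos = nn.Linear(768, num_labels)

    def forward(self, x):
        # Use the layers defined in __init__; softmax needs an explicit dim
        x1 = torch.softmax(self.classifierDeprel(x), dim=-1)
        y1 = torch.softmax(self.classifierRelPos(x), dim=-1)
        return x1, y1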

Now the problem is that after instantiating the trainer in this way

trainer = Trainer(...)

I can’t train the model with

trainer.train(
    model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
)

I think this is because model_path no longer refers to BERT’s model but to my new one, and I don’t know how to specify it.

You probably need two loss functions, or at least compute one loss on one set of labels and another loss on the other set (both can use the same loss function if you like).
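
A minimal sketch of that idea, assuming each task has its own logits tensor and its own label tensor: the same nn.CrossEntropyLoss is applied once per task and the two losses are summed. The random tensors stand in for the outputs of the two heads, the label-set sizes are placeholders, and summing with equal weights is just one possible choice. Note that nn.CrossEntropyLoss expects raw logits, so no softmax should be applied beforehand.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch, seq_len = 2, 5
num_pos_labels, num_head_labels = 17, 10  # placeholder label-set sizes

# Stand-ins for the raw outputs (logits) of the two classification heads
pos_logits = torch.randn(batch, seq_len, num_pos_labels, requires_grad=True)
head_logits = torch.randn(batch, seq_len, num_head_labels, requires_grad=True)

# One label tensor per task; -100 marks positions that should not count
# towards the loss (for example padding)
pos_labels = torch.randint(0, num_pos_labels, (batch, seq_len))
head_labels = torch.randint(0, num_head_labels, (batch, seq_len))
pos_labels[:, -1] = -100
head_labels[:, -1] = -100

loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

# Flatten (batch, seq_len, labels) to (batch * seq_len, labels) for the loss
pos_loss = loss_fn(pos_logits.view(-1, num_pos_labels), pos_labels.view(-1))
head_loss = loss_fn(head_logits.view(-1, num_head_labels), head_labels.view(-1))

# A single scalar, so one backward pass covers both tasks
total_loss = pos_loss + head_loss
total_loss.backward()
```

If one task dominates training, the two terms could be weighted differently before summing.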

Hey, I am trying to do a multi-label-multi-class text classification project as well.
Were you able to make any progress on this that you could share?
I have not come across many resources on how to actually get this done…
Thank you!

Hi @attari.

In a nutshell, what I did was split the problem into two separate multi-class problems. In the first task I use one set of labels to label each word of the sentence (each word gets exactly one label, as in PoS tagging). In the second task I use a different set of labels to do the same thing. Finally, I evaluate together the predictions that the two tasks produce separately.
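
The post does not say how the joint evaluation was done. As one possible illustration, the sketch below assumes the predictions of the two tasks are aligned token by token and reports the accuracy of each task plus a joint accuracy that counts a token only when both labels are correct. The function name evaluate_jointly and the example labels are made up.

```python
def evaluate_jointly(gold_pos, pred_pos, gold_head, pred_head):
    """Return per-task and joint token accuracy for aligned label lists."""
    total = len(gold_pos)
    if not (len(pred_pos) == len(gold_head) == len(pred_head) == total):
        raise ValueError("all label lists must have the same length")

    pos_ok = [g == p for g, p in zip(gold_pos, pred_pos)]
    head_ok = [g == p for g, p in zip(gold_head, pred_head)]

    return {
        "pos_accuracy": sum(pos_ok) / total,
        "head_accuracy": sum(head_ok) / total,
        # A token counts as jointly correct only if both tasks got it right
        "joint_accuracy": sum(a and b for a, b in zip(pos_ok, head_ok)) / total,
    }


# Made-up gold labels and predictions for a four-token sentence
gold_pos = ["DET", "NOUN", "VERB", "PUNCT"]
pred_pos = ["DET", "NOUN", "NOUN", "PUNCT"]
gold_head = [2, 3, 0, 3]
pred_head = [2, 3, 0, 2]

scores = evaluate_jointly(gold_pos, pred_pos, gold_head, pred_head)
print(scores)
# {'pos_accuracy': 0.75, 'head_accuracy': 0.75, 'joint_accuracy': 0.5}
```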

Looking at the legacy run_ner example, I basically created two different tasks in the file