Trying to design a multi-label text classification model

It would be great if someone could help me understand what’s wrong with this model. The problem is that on this small test dataset, the accuracy decreases all the way to zero:

import torch

# Six encoded sentences, padded to length 8; permute(1, 0) turns the
# (batch, seq_len) tensor into the (seq_len, batch) layout that
# torch.nn.GRU expects by default (batch_first=False).
dataset = {
    "input": torch.tensor([[1, 2, 3, 24, 25, 25, 25, 25],
                           [4, 5, 2, 6, 24, 25, 25, 25],
                           [7, 8, 9, 24, 24, 25, 25, 25],
                           [4, 10, 11, 12, 24, 25, 25, 25],
                           [13, 14, 2, 15, 2, 16, 17, 24],
                           [18, 19, 20, 21, 22, 23, 3, 24]], dtype=torch.long).permute(1, 0),
    # Two independent binary labels per sample (multi-label targets).
    "target": torch.tensor([[1., 0.],
                            [0., 1.],
                            [1., 0.],
                            [0., 1.],
                            [1., 1.],
                            [0., 0.]], dtype=torch.float32),
}

class MltcModel(torch.nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim, rnn_num_layers=1):
        super().__init__()

        self.embedding = torch.nn.Embedding(vocab_size, emb_dim)
        self.rnn = torch.nn.GRU(emb_dim, hid_dim, bidirectional=True, num_layers=rnn_num_layers)
        # The GRU's final hidden state has one slice per layer and per
        # direction, hence hid_dim * 2 * rnn_num_layers input features here.
        self.l1 = torch.nn.Linear(hid_dim * 2 * rnn_num_layers, 256)
        self.l2 = torch.nn.Linear(256, 2)

    def forward(self, samples):
        embedded = self.embedding(samples)

        # last_hidden: (num_layers * 2, batch, hid_dim)
        _, last_hidden = self.rnn(embedded)

        # Concatenate the hidden states of all layers/directions per sample.
        hidden_list = [last_hidden[i, :, :] for i in range(last_hidden.size()[0])]
        encoded = torch.cat(hidden_list, dim=1)

        encoded = torch.nn.functional.relu(self.l1(encoded))
        # Per-label probabilities in [0, 1].
        encoded = torch.sigmoid(self.l2(encoded))

        return encoded

model = MltcModel(26, 256, 512, rnn_num_layers=2)
criterion = torch.nn.MultiLabelSoftMarginLoss()
optimizer = torch.optim.Adam(model.parameters())

for epoch in range(100):
    # Full-batch gradient step on the whole toy dataset.
    optimizer.zero_grad()
    output = model(dataset["input"])
    loss = criterion(output, dataset["target"])
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        acc = torch.abs(output - dataset["target"]).view(-1)
        acc = acc.sum() / acc.size()[0] * 100.
        print(f'Epoch({epoch+1}) loss: {loss.item()}, accuracy: {acc:.1f}%')

And this is the print output I’m getting from it:

Epoch(1) loss: 0.7223548293113708, accuracy: 49.9%
Epoch(2) loss: 0.6718186736106873, accuracy: 41.9%
Epoch(3) loss: 0.6158247590065002, accuracy: 30.7%
Epoch(4) loss: 0.5509805083274841, accuracy: 13.8%
Epoch(5) loss: 0.5162160992622375, accuracy: 3.7%
Epoch(6) loss: 0.5061318278312683, accuracy: 0.8%
Epoch(7) loss: 0.5038571357727051, accuracy: 0.2%
Epoch(8) loss: 0.5033745169639587, accuracy: 0.0%
Epoch(9) loss: 0.5032582879066467, accuracy: 0.0%
Epoch(10) loss: 0.5032244920730591, accuracy: 0.0%
Epoch(11) loss: 0.5032129883766174, accuracy: 0.0%
Epoch(12) loss: 0.5032084584236145, accuracy: 0.0%
Epoch(13) loss: 0.5032065510749817, accuracy: 0.0%
Epoch(14) loss: 0.5032057166099548, accuracy: 0.0%
Epoch(15) loss: 0.5032051205635071, accuracy: 0.0%
Epoch(16) loss: 0.503204882144928, accuracy: 0.0%
Epoch(17) loss: 0.5032047629356384, accuracy: 0.0%
Epoch(18) loss: 0.5032046437263489, accuracy: 0.0%
Epoch(19) loss: 0.5032045841217041, accuracy: 0.0%
Epoch(20) loss: 0.5032045841217041, accuracy: 0.0%
...

The dataset is supposed to be an encoded version of some sentences that I’m trying to classify as toxic or not (there are a bunch of other labels that I’ve omitted here). I know it doesn’t make much sense in this form, but I was hoping to easily overfit such a small dataset. Instead, it went south.

Thanks.

I spent some more time on this, which apparently I should have done in the first place. The problem is pretty simple: my metric formula was wrong! It should have been:

acc = (1. - acc.sum() / acc.size()[0]) * 100.
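
The torch.abs(output - target) expression is really a mean absolute error, so the printed number was going to zero because the model was fitting the data, not failing on it; subtracting it from 1 turns it back into an accuracy-like score. A thresholded per-label accuracy may read even more naturally; here’s a minimal sketch (the preds name is mine) that would drop into the with torch.no_grad() block:

with torch.no_grad():
    # Threshold the sigmoid outputs at 0.5 to get hard 0/1 predictions,
    # then report the fraction of individual labels predicted correctly.
    preds = (output > 0.5).float()
    acc = (preds == dataset["target"]).float().mean() * 100.
    print(f'Epoch({epoch+1}) loss: {loss.item()}, accuracy: {acc:.1f}%')

Side note: MultiLabelSoftMarginLoss applies a sigmoid internally, which, if I’m reading the docs right, is why the loss plateaus around 0.5032 instead of reaching zero here; passing it raw logits rather than sigmoid outputs would let the loss keep decreasing.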