Sequence multilabel classification accuracy problem

I’m trying to implement a multi-label classification task; currently my model consists of an Embedding layer, a GRU, and two Linear layers.

I have padded the data, and its shape is (seq_len x batch), where seq_len is the length of the longest sequence in that batch. The targets are multi-hot encoded, as I’m using BCEWithLogitsLoss.

I have a weird issue: with batch size > 1 I get much lower accuracy (0.3) than with batch size = 1 (0.8). I suspected it might be a padding issue, but I was also able to reproduce it with same-length sequences. I’m trying my luck here in case anyone has encountered something similar and knows what the problem was.
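For reference, here is a minimal sketch of the setup being described. The layer sizes (vocabulary, embedding, hidden, and label dimensions) are placeholders, not the poster's actual hyperparameters:

```python
import torch
import torch.nn as nn

class MultiLabelGRU(nn.Module):
    # Hypothetical reconstruction of the described model; sizes are assumptions.
    def __init__(self, vocab_size=1000, embedding_dim=100, hidden_dim=64, num_labels=126):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.gru = nn.GRU(embedding_dim, hidden_dim)  # default: (seq_len, batch, features)
        self.fc1 = nn.Linear(hidden_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_labels)

    def forward(self, x):
        # x: (seq_len, batch) of token indices
        embeds = self.embedding(x)               # (seq_len, batch, embedding_dim)
        _, hidden = self.gru(embeds)             # hidden: (num_layers, batch, hidden_dim)
        out = torch.relu(self.fc1(hidden[-1]))   # (batch, hidden_dim)
        return self.fc2(out)                     # (batch, num_labels), raw logits

model = MultiLabelGRU()
criterion = nn.BCEWithLogitsLoss()               # expects raw logits, multi-hot targets
x = torch.randint(1, 1000, (377, 2))             # padded batch, seq_len-first
targets = torch.randint(0, 2, (2, 126)).float()  # multi-hot labels
loss = criterion(model(x), targets)
```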

How large is the batch size in the low-accuracy case?
Very large batch sizes tend to give worse accuracy, but the gap seems too large for that to be the cause.

Could you post the code of your model and training routine so that we can have a look, in case e.g. a slicing operation is wrong?

I assume your input has the shape [batch_size, seq_len]?
If so, then self.gru would get an input of [batch_size, seq_len, embedding_dim], while it expects an input of [seq_len, batch_size, input_size] in the default setup.

If my assumption is correct, you could either permute the input or pass batch_first=True when creating the nn.GRU, which would then expect an input of [batch_size, seq_len, features].
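To illustrate both options (the tensor sizes here are just illustrative):

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=100, hidden_size=64)                    # default: seq-first
gru_bf = nn.GRU(input_size=100, hidden_size=64, batch_first=True)

x_batch_first = torch.randn(2, 377, 100)  # (batch, seq_len, features)

# Option 1: permute to (seq_len, batch, features) for the default layout
out1, h1 = gru(x_batch_first.permute(1, 0, 2))

# Option 2: keep (batch, seq_len, features) and use batch_first=True
out2, h2 = gru_bf(x_batch_first)

print(out1.shape)  # torch.Size([377, 2, 64])
print(out2.shape)  # torch.Size([2, 377, 64])
```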

Actually, it is seq_len x batch_size. I printed the shapes for all layers:

```
inputs shape:  torch.Size([377, 2])
embeds shape:  torch.Size([377, 2, 100])
hidden[0] shape:  torch.Size([2, 64])
output shape:  torch.Size([2, 126])
```

The shapes look correct.
I’m currently unsure what might be wrong :confused:
Did you verify that your collate_fn is creating valid batches?
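One way to verify it would be something like the following. This sketch assumes a hypothetical collate_fn that builds (seq_len, batch) batches with pad_sequence; it pads two raw samples and checks that each column still matches the original sequence, with zeros only in the padding positions:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Hypothetical collate_fn producing (seq_len, batch) padded batches
def collate_fn(batch):
    seqs, labels = zip(*batch)
    return pad_sequence(seqs, batch_first=False, padding_value=0), torch.stack(labels)

a = torch.tensor([5, 3, 9])
b = torch.tensor([7, 1])
labels = [torch.zeros(126), torch.ones(126)]
padded, y = collate_fn([(a, labels[0]), (b, labels[1])])

assert padded.shape == (3, 2)          # longest sequence x batch
assert torch.equal(padded[:3, 0], a)   # first column is sample a
assert torch.equal(padded[:2, 1], b)   # second column starts with sample b
assert padded[2, 1] == 0               # ...and ends in padding
```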

Afaik, yes. But this helps anyway: at least it’s not some trivial bug in the model, so it might be related to something else. Thanks for your help!
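In case it does turn out to be padding-related after all: with variable-length sequences and batch size > 1, a common culprit is that the GRU keeps stepping over the padded positions, so the last hidden state reflects padding rather than the last real token. pack_padded_sequence avoids that. A sketch with illustrative sizes and lengths:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

gru = nn.GRU(input_size=100, hidden_size=64)

embeds = torch.randn(377, 2, 100)   # (seq_len, batch, embedding_dim), already padded
lengths = torch.tensor([377, 200])  # true lengths before padding (sorted descending)

# Packing makes the GRU stop at each sequence's real end, so hidden[-1]
# is the state at the last *valid* step, not the last padded step.
packed = pack_padded_sequence(embeds, lengths, enforce_sorted=True)
_, hidden = gru(packed)
print(hidden[-1].shape)  # torch.Size([2, 64])
```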