I have a multi-class classification dataset like the one below. The number of classes is five.
Input: i like summer. the weather is nice.
Output: 3 // WEATHER class index
Then I created a model. It runs without errors, but the accuracy is high only when I set the batch size to 1. With a slightly larger batch size (such as 3, 5, or 10), the accuracy drops by more than 30%.
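Roughly, this is how I measure accuracy (a minimal sketch; `accuracy` is a hypothetical helper, `model` is the module below, and `batches` is a list of (x, y) pairs):

import torch

# Sketch: x has shape (L, bs), y has shape (bs,) with class indices.
def accuracy(model, batches):
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in batches:
            pred = model(x).argmax(dim=-1)       # (bs,) predicted classes
            correct += (pred == y).sum().item()
            total += y.size(0)
    return correct / total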
When I use mini-batches, I pad the sequences with 0 to make their lengths equal and then reverse them (a construction sketch follows the matrix). I suspect the padding breaks something in my model. Or is this a common phenomenon in sequence training? For example, the following minibatch is 53x3 (length x batch_size):
0 17 0
0 484 0
0 481 0
0 605 0
0 539 0
0 675 0
0 640 0
539 334 0
126 44 0
699 216 0
256 570 0
334 688 539
578 251 126
525 3 563
295 8 256
27 525 334
578 131 701
87 63 578
235 457 71
334 205 525
119 386 457
444 95 35
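For reference, this is roughly how I build such a batch (a sketch with made-up token ids, using `pad_sequence` from `torch.nn.utils.rnn`):

import torch
from torch.nn.utils.rnn import pad_sequence

# Made-up token-id sequences of different lengths.
seqs = [torch.tensor([334, 578, 525, 295]),
        torch.tensor([17, 484, 481, 605, 539]),
        torch.tensor([539, 126, 563])]

# Right-pad with 0 to a common length in (L, bs) layout, then flip along
# time so the zeros move to the front and the tokens come out reversed.
x = pad_sequence(seqs)  # (L, bs) = (5, 3), zeros at the bottom
x = x.flip(0)           # zeros at the top, sequences reversed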
import torch
import torch.nn as nn
import torch.nn.functional as F

class Simple(nn.Module):
    def __init__(self, vocab_size, embd_size, hidden_size, class_size):
        super(Simple, self).__init__()
        self.embd = nn.Embedding(vocab_size, embd_size, padding_idx=0)
        self.ctx_encoder = nn.GRU(embd_size, hidden_size, bidirectional=True)
        self.decoder = nn.Linear(hidden_size * 2, hidden_size)
        self.last_layer = nn.Linear(hidden_size, class_size)

    def forward(self, x):
        '''
        x: (L, bs); batch_first is False
        '''
        batch_size = x.size(1)
        x = self.embd(x)            # (L, bs, E)
        _, h = self.ctx_encoder(x)  # (L, bs, 2H), (2, bs, H)
        h = h.view(batch_size, -1)  # (bs, 2H)
        out = self.decoder(h)       # (bs, H)
        out = self.last_layer(out)  # (bs, class_size)
        return F.log_softmax(out, -1)
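A quick shape check with made-up sizes (the vocabulary size and dimensions are arbitrary):

model = Simple(vocab_size=702, embd_size=50, hidden_size=64, class_size=5)
x = torch.randint(1, 702, (53, 3))  # (L, bs) batch of token ids
log_probs = model(x)                # (3, 5) log-probabilities over classes
print(log_probs.shape)              # torch.Size([3, 5])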