I have seen this error discussed here on discuss.pytorch.org, but the proposed solutions did not fit my needs.
This is my first project with PyTorch. I would have preferred to start with something a bit simpler and learn from smaller examples, but I really, really need to finish this particular project, so I chose PyTorch.
The goal: a text classifier that extracts intents from sentences.
I started with the word2vec model from the gensim library, and I think I imported the weights, vocab, and indexes correctly.
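To show the kind of hand-off I mean, here is a minimal sketch of loading pretrained vectors into `nn.Embedding` (a random toy matrix stands in for gensim's `model.vectors` here, since my exact loading code is not shown):

```python
import torch
import torch.nn as nn

# Toy stand-in for the gensim word2vec matrix (in practice: model.vectors)
vocab_size, embedding_dim = 5000, 300
weights = torch.randn(vocab_size, embedding_dim)

# freeze=False keeps the pretrained embeddings trainable
embedding = nn.Embedding.from_pretrained(weights, freeze=False)

# A batch of 21 sentences, each encoded as 10 word indices
word_ids = torch.randint(0, vocab_size, (21, 10))
vectors = embedding(word_ids)
print(vectors.shape)  # torch.Size([21, 10, 300])
```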
Current situation / code

```python
class IntentLSTM(nn.Module):
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim,
                 n_layers, drop_prob=0.5):
        # Initialize the model by setting up the layers
        super().__init__()
        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim

        # Embedding and LSTM layers
        self.embedding = nn.Embedding.from_pretrained(weights)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                            dropout=drop_prob, batch_first=True)
        self.label = nn.Linear(hidden_dim, output_size)
        self.dropout = nn.Dropout(0.3)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):
        # Embedding and LSTM output
        embedd = self.embedding(x)
        lstm_out, hidden = self.lstm(embedd, hidden)
        lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)
        out = self.dropout(lstm_out)
        sig_out = self.softmax(lstm_out)
        return sig_out, hidden
```
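To make concrete what I think the reshape in `forward` is doing, here is a standalone repro of just that `view` call, with sizes taken from my batch (21 sentences of 10 words, hidden_dim of 10):

```python
import torch

batch_size, seq_len, hidden_dim = 21, 10, 10
lstm_out = torch.randn(batch_size, seq_len, hidden_dim)

# view(-1, hidden_dim) merges the batch and sequence dimensions:
# 21 * 10 = 210 rows come out, one per time step, not one per sentence
flat = lstm_out.contiguous().view(-1, hidden_dim)
print(flat.shape)  # torch.Size([210, 10])
```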
The following are the params used to instantiate the model, plus the criterion and optimizer:

```python
vocab_size = len(model.vocab)
output_size = 4
embedding_dim = 300
hidden_dim = 10
n_layers = 2

net = IntentLSTM(vocab_size, output_size, embedding_dim, hidden_dim, n_layers)

for param in net.parameters():
    param.requires_grad = True

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
```
The reason for embedding_dim=300 is that each word in the word2vec model has 300 features. The hidden_dim=10 is because each sentence is 10 words (ints) long. The output_size=4 is because I have 4 possible classes (for classification, but maybe this is wrong too).
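To sanity-check those numbers, here is a quick shape check of a bare LSTM built with the same dimensions (the 10-word sentence length appears as the middle dimension of the input, while hidden_dim sets the size of the hidden state vector):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=300, hidden_size=10, num_layers=2, batch_first=True)

x = torch.randn(21, 10, 300)  # (batch, sentence length, embedding_dim)
out, (h, c) = lstm(x)
print(out.shape)  # torch.Size([21, 10, 10]) -> (batch, seq_len, hidden_dim)
print(h.shape)    # torch.Size([2, 21, 10])  -> (n_layers, batch, hidden_dim)
```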
The problem occurs next, at the criterion function call:

```python
for e in range(epochs):
    # initialize hidden state
    h = net.init_hidden(batch_size)

    # batch loop
    for inputs, labels in train_loader:
        counter += 1
        h = tuple([each.data for each in h])
        net.zero_grad()
        output, h = net(inputs.squeeze(), h)
        loss = criterion(output.squeeze(), labels.long())  # <-- error raised here
        loss.backward()
        # `clip_grad_norm` helps prevent the exploding gradient problem in RNNs / LSTMs.
        nn.utils.clip_grad_norm_(net.parameters(), clip)
        optimizer.step()
```
```
ValueError: Expected input batch_size (210) to match target batch_size (21).
```
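As far as I understand, `nn.CrossEntropyLoss` wants the input and target to agree on the batch dimension: scores of shape (batch, num_classes) against targets of shape (batch,). A minimal sketch of the passing and failing cases, using 4 classes and a batch of 21 as in my setup:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
targets = torch.randint(0, 4, (21,))  # one class id per sentence

# Matching batch sizes: works
loss = criterion(torch.randn(21, 4), targets)

# 210 rows of scores vs 21 targets: reproduces the reported error
try:
    criterion(torch.randn(210, 4), targets)
except ValueError as err:
    print(err)  # Expected input batch_size (210) to match target batch_size (21).
```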
Now, I have tried everything I have read, but I still can't seem to understand how and why the tensors' sizes change; for now, my guess is that this is where the problem lies.
This is a print log from the forward function (the shapes after the `view` follow from the code, since dropout and softmax do not change the shape):

```
embedd after self.embedding(x):
torch.Size([21, 10, 300])
lstm_out after self.lstm(embedd, hidden):
torch.Size([21, 10, 10])
lstm_out after lstm_out.contiguous().view(-1, self.hidden_dim):
torch.Size([210, 10])
out after self.dropout(lstm_out):
torch.Size([210, 10])
sig_out after self.softmax(lstm_out):
torch.Size([210, 10])
```
Why does the input batch size not match the target batch size?
Thank you so very much for any help provided.