Hi, I am new to PyTorch; here is my question.
For the CoNLL03 NER task, I did the following preprocessing:
- build the vocab, idx2word, word2idx, etc.
- build a pre-trained word embedding matrix “E” as a FloatTensor of shape (vocab_size, embedding_dim), indexed by my vocab. E.g. if “Hello” has index 42, then E[42] is the embedding for “Hello”. (A rough sketch of how I built it follows below.)
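For reference, here is roughly how I built E (a minimal sketch with toy stand-ins; `word2idx` and `pretrained` are hypothetical names for my actual data structures):

```python
import torch

# Toy stand-ins, just for illustration:
word2idx = {"<pad>": 0, "<unk>": 1, "Hello": 2}
pretrained = {"Hello": [0.1, 0.2, 0.3, 0.4]}  # word -> pre-trained vector
embedding_dim = 4

E = torch.zeros(len(word2idx), embedding_dim)  # shape (vocab_size, embedding_dim)
for word, idx in word2idx.items():
    if word in pretrained:
        E[idx] = torch.tensor(pretrained[word])    # copy the pre-trained vector
    else:
        E[idx] = torch.randn(embedding_dim) * 0.1  # small random init for OOV words

# Now E[word2idx["Hello"]], i.e. E[2], holds the pre-trained vector for "Hello".
```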
Here is my model:
```python
import torch
import torch.nn as nn

class BiLSTM(nn.Module):
    def __init__(self, embeddings, embedding_dim):
        super(BiLSTM, self).__init__()
        # the "embeddings" here is the tensor E I mentioned above
        self.word_embeddings = nn.Embedding(
            num_embeddings=embeddings.shape[0],
            embedding_dim=embedding_dim,
        ).cuda()
        self.word_embeddings.from_pretrained(embeddings, freeze=False)
        ...

    def forward(self, inputs):
        # inputs: [batch_size, seq_len], each entry is the index of that token
        x = self.word_embeddings(inputs)
        ...
```
Here is the issue I found:

In my understanding, x[0][0] should be the embedding of the first token of the first sequence in inputs, i.e. the row of E for that word. However, when I printed x[0][0], it was different from that word’s embedding in E.

It can’t be that the embedding weights were already updated, because I hadn’t called loss.backward() and optimizer.step() yet. Did I do anything wrong?
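To show exactly what I compared, here is roughly the check I ran (a minimal sketch, assuming the BiLSTM class and the toy E from the sketches above; the random `inputs` batch is just for illustration):

```python
# Build the model and a dummy batch of token indices.
model = BiLSTM(E, embedding_dim)
inputs = torch.randint(0, E.shape[0], (2, 5)).cuda()  # [batch_size, seq_len]

idx = inputs[0, 0].item()          # vocab index of the first token of the first sequence
x = model.word_embeddings(inputs)  # [batch_size, seq_len, embedding_dim]

# The lookup does match the embedding layer's current weight row...
print(torch.allclose(x[0, 0], model.word_embeddings.weight[idx]))  # True
# ...but that row is different from my pre-trained matrix E:
print(torch.allclose(x[0, 0].cpu(), E[idx]))                       # False for me
```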