Hi, I’m following the PyTorch tutorial on the official PyTorch site,
and I’m at the step of building the network for the N-gram language model.
I started to wonder about several things I may have glossed over without much care,
like nn.Embedding, the loss function, and so on.
So I have two questions to ask.
I believe word embeddings are trained rather than fixed, yet my code (which follows the tutorial) produces exactly the same embedding values as the example. How does nn.Embedding work?
I searched the official documentation on the site, and it says:
weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim)
So does that mean that, given my dataset and embedding dimension, it will compute different embeddings every time I run the program?
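From what I can tell, yes: the weights are randomly initialized, so they only come out identical across runs when the random seed is fixed (the tutorial sets one, which would explain matching the example exactly). A minimal sketch, with made-up sizes (10 words, 5 dimensions) purely for illustration:

```python
import torch
import torch.nn as nn

# With the same seed, the random initial weights are identical...
torch.manual_seed(0)
emb_a = nn.Embedding(10, 5)

torch.manual_seed(0)
emb_b = nn.Embedding(10, 5)
print(torch.equal(emb_a.weight, emb_b.weight))  # True

# ...but a layer created without re-seeding draws different values,
# and training then changes them further via gradient descent.
emb_c = nn.Embedding(10, 5)
print(torch.equal(emb_a.weight, emb_c.weight))  # False
```

So the lookup table starts random and is learned like any other weight matrix; the seed is what makes two runs agree.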
How does the loss function work?
def forward(self, inputs):
    embeds = self.embeddings(inputs).view(1, -1)
    out = F.relu(self.linear1(embeds))
    # out = F.relu(self.linear2(out))
    out = self.linear3(out)
    log_probs = F.log_softmax(out, dim=1)
    return log_probs

# inside the training loop
loss_function = nn.NLLLoss()  # negative log-likelihood loss
log_probs = model(context_var)
loss = loss_function(log_probs, Variable(torch.LongTensor([word_to_ix[target]])))
total_loss += loss.data
Above is my source code. Does loss_function take the index of the target, treat its true probability as 1, and adjust all the related weights that the loss Variable holds, so that the negative log-likelihood loss is minimized?
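My rough understanding, sketched with a toy 4-word vocabulary (the numbers are made up): NLLLoss itself just picks out the log-probability at the target index and negates it (averaging over the batch); the weight adjustment then happens in the backward pass, which pushes that log-probability toward 0 (i.e. probability toward 1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy scores over a 4-word vocabulary for a single example.
scores = torch.tensor([[2.0, 0.5, -1.0, 0.3]])
log_probs = F.log_softmax(scores, dim=1)

target = torch.tensor([0])  # index of the true next word

loss_fn = nn.NLLLoss()
loss = loss_fn(log_probs, target)

# NLLLoss simply selects -log_probs[0, target] (mean over the batch):
manual = -log_probs[0, target.item()]
print(torch.allclose(loss, manual))  # True
```

So the loss value is just -log p(target); calling loss.backward() is what computes the gradients that the optimizer then uses to update the embeddings and linear layers.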