I have been working with pretrained embeddings (GloVe) and would like to allow these to be fine-tuned. I currently use embeddings like this:
word_embeddingsA = nn.Embedding(vocab_size, embedding_length)
word_embeddingsA.weight = nn.Parameter(TEXT.vocab.vectors, requires_grad=False)
Should I simply set requires_grad=True to allow the embeddings to be trained? Or should I do something like this:
word_embeddingsA = nn.Embedding.from_pretrained(TEXT.vocab.vectors, freeze=False)
Are these equivalent, and do I have a way to check that the embeddings are getting trained?
The approaches should yield the same result (if you use requires_grad=True in the first approach).
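To illustrate the equivalence, here is a minimal sketch comparing the two approaches, using a random tensor as a stand-in for TEXT.vocab.vectors:

```python
import torch
import torch.nn as nn

# Hypothetical pretrained vectors standing in for TEXT.vocab.vectors.
vocab_size, embedding_length = 10, 8
pretrained = torch.randn(vocab_size, embedding_length)

# Approach 1: assign the weights manually with requires_grad=True.
embA = nn.Embedding(vocab_size, embedding_length)
embA.weight = nn.Parameter(pretrained.clone(), requires_grad=True)

# Approach 2: from_pretrained with freeze=False.
embB = nn.Embedding.from_pretrained(pretrained.clone(), freeze=False)

# Both layers start from the same weights, and both are trainable.
assert torch.equal(embA.weight, embB.weight)
assert embA.weight.requires_grad and embB.weight.requires_grad
```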
To make sure this layer is trained, you could check the gradients after the backward call via:
print(model.word_embeddings.weight.grad)
and you should see valid gradients. If you are seeing None instead, the computation graph might have been detached at some point.
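As a minimal standalone sketch of that check (the layer and input here are toy placeholders, not your actual model):

```python
import torch
import torch.nn as nn

# Toy embedding layer standing in for model.word_embeddings.
emb = nn.Embedding(10, 4)

# Forward pass through the layer, then backpropagate a scalar loss.
out = emb(torch.tensor([1, 2, 3])).sum()
out.backward()

# Valid gradients mean the layer is attached to the graph and will be trained.
print(emb.weight.grad is None)  # False: gradients were populated
```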
OK, thanks. If I now want to run inference using the trained weights, can I still do
test_sen1 = TEXT.preprocess(test_sen1)
test_sen1 = [[TEXT.vocab.stoi[x] for x in test_sen1]]
test_sen1 = np.asarray(test_sen1)
test_sen1 = torch.LongTensor(test_sen1)
test_tensor1 = Variable(test_sen1, volatile=True)
output = model(test_tensor1,1)
I suppose this may still be OK, as the TEXT object is just supplying the indices?
I'm not familiar enough with torchtext, unfortunately.
Could you test this code snippet using some training examples and run a sanity check to see if the predictions are as expected?
PS: Variables are deprecated since PyTorch 0.4, so you can use tensors now.
To save memory, wrap the inference code in a with torch.no_grad() block.
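A sketch of how that inference path could look without Variable or volatile=True; the model and index tensor below are hypothetical placeholders for your actual pipeline:

```python
import torch
import torch.nn as nn

# Placeholder model: embedding -> flatten -> linear head.
model = nn.Sequential(nn.Embedding(10, 4), nn.Flatten(0), nn.Linear(12, 2))

# Plain LongTensor of token indices; no Variable wrapper is needed.
indices = torch.tensor([[1, 2, 3]], dtype=torch.long)

model.eval()
with torch.no_grad():        # replaces volatile=True
    output = model(indices)

print(output.requires_grad)  # False: no graph is built, saving memory
```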