Selectively allow some indices in Embedding to be learned

I’m working on a dependency-tree oracle network, and I’m using the pretrained 50d GloVe vectors. However, as is common in CL, I also have special tokens – in my case, one for when a feature refers to a word that isn’t there (e.g. the dependency of a leaf word), one for unknown words, and one for the root of the tree. I want the network to learn these special tokens’ vectors starting at the first epoch, but I want to freeze the other, non-special vectors until the network has learned enough to fine-tune them.

My code is set up so these special tokens are indices 0–2 of vocab, and I initialize the embedding like so (I’m on torch 0.3.1, so I can’t use Embedding.from_pretrained):

self.word_embeddings = nn.Embedding(len(vocab), vocab_embedding_size)
self.word_embeddings.load_state_dict({'weight': torch.cuda.FloatTensor([
    get_pretrained(k) or ((np.random.random(vocab_embedding_size) * 2 - 1) * 0.01)
    for k in vocab
])})

My forward() method then looks like this:

def forward(self, s_w_idxs, ...):
    s_w_embeds = self.word_embeddings(s_w_idxs).view((1, -1))

Is there a good way to get this to work? I’m a definite newbie with PyTorch, so nothing is coming to mind. I could maybe map over the gradient and mask out the values for the frozen rows before the update reaches word_embeddings, but I’m not sure exactly how to do that.
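For reference, here’s a toy sketch of the gradient-masking idea I have in mind, using a backward hook on the weight to zero out gradients for the frozen rows. I’ve written it against a recent PyTorch for clarity (on 0.3.1 the tensors would need to be wrapped in Variable); the sizes and names are made up, and I’m not sure this is the idiomatic way to do it:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, emb_dim = 10, 5  # toy stand-ins for len(vocab) / vocab_embedding_size
emb = nn.Embedding(vocab_size, emb_dim)

# 1 for the special tokens (indices 0-2) that should keep training,
# 0 for the pretrained rows that should stay frozen for now
grad_mask = torch.zeros(vocab_size, 1)
grad_mask[:3] = 1.0

# Zero the gradient on frozen rows before the optimizer sees it;
# keeping the handle lets me remove the hook later to unfreeze everything.
hook = emb.weight.register_hook(lambda grad: grad * grad_mask)

optim = torch.optim.SGD(emb.parameters(), lr=0.1)
before = emb.weight.detach().clone()

idxs = torch.tensor([0, 1, 5, 7])  # batch touching special and non-special rows
emb(idxs).sum().backward()
optim.step()
```

After the step, rows 3 and up should be byte-identical to `before`, while the special rows that appeared in the batch have moved; calling `hook.remove()` would then let every row train.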