Thanks so much. Do you know if there’s any easy way to vectorize this kind of selection operation?
If you have a mask of the cells that should be frozen, and two full embedding matrices, one frozen and one dynamic, you could write:
dynamic = dynamic * mask + frozen
The frozen matrix holds the fixed value in every frozen cell and 0 elsewhere. The mask holds 0 wherever a frozen parameter should be used and 1 elsewhere.
You can build the mask and frozen matrix during initialization.
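To make the masking idea concrete, here is a minimal sketch. It assumes whole rows are frozen, and the names pretrained, frozen_rows and effective_weight are made up for illustration, with random tensors standing in for real vectors. The lookup goes through the combined matrix, so the gradient for the frozen cells is multiplied by 0 and never reaches dynamic.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

n_vocab, dim = 5, 3
pretrained = torch.randn(n_vocab, dim)  # stand-in for real pretrained vectors
frozen_rows = torch.tensor([0, 2])      # hypothetical: rows that must stay fixed

# mask: 0 wherever the frozen value should be used, 1 elsewhere
mask = torch.ones(n_vocab, dim)
mask[frozen_rows] = 0.0

# frozen: pretrained values in the frozen cells, 0 elsewhere
frozen = pretrained * (1.0 - mask)

# dynamic: the only trainable tensor
dynamic = nn.Parameter(torch.randn(n_vocab, dim))


def effective_weight():
    # Frozen cells come from `frozen`, all other cells from `dynamic`
    return dynamic * mask + frozen


ids = torch.tensor([0, 1, 2, 3])
out = F.embedding(ids, effective_weight())
out.sum().backward()

# Rows 0 and 2 receive zero gradient; rows 1 and 3 receive a real one
print(dynamic.grad)
```

Only dynamic would be handed to the optimizer here. The mask and the frozen matrix are plain tensors, so they are never updated.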
Thanks for the help from you all!
I also wrote a short snippet that shows how to load pre-trained embeddings from spaCy into nn.Embedding.
Hope this helps!
import spacy
import torch
import torch.nn as nn
import numpy as np
nlp = spacy.load('en_core_web_md')
n_vocab, vocab_dim = nlp.vocab.vectors.shape
emb = nn.Embedding(n_vocab, vocab_dim)
# Load pretrained embeddings
emb.weight.data.copy_(torch.from_numpy(nlp.vocab.vectors.data))
# --- Equivalence test for spaCy vectors and the torch embedding ---
test_vocab = ['apple', 'bird', 'cat', 'dog', 'egg', 'e12dsafdsf1']
# dict mapping a vocab hash to its row index in the word vector matrix
key2row = nlp.vocab.vectors.key2row
for v in test_vocab:
    vocab_id = nlp.vocab.strings[v]
    spacy_vec = nlp.vocab[v].vector
    row = key2row.get(vocab_id, None)
    if row is None:
        print('{} is oov'.format(v))
        continue
    vocab_row = torch.tensor(row, dtype=torch.long)
    embed_vec = emb(vocab_row)
    # Compare the spaCy vector with the row looked up through nn.Embedding
    print(np.allclose(spacy_vec, embed_vec.detach().numpy()))
Load pre-trained GloVe embeddings
import torch.nn as nn
import torchtext
# Fetches the GloVe 6B vectors (300 dimensions) if they are not cached yet
glove = torchtext.vocab.GloVe(name='6B', dim=300)
embedding_layer = nn.Embedding.from_pretrained(glove.vectors)
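One thing to keep in mind with from_pretrained: it freezes the weights by default, so pass freeze=False if you want to fine-tune them. Here is a small sketch, with a random tensor (the name vectors is made up) standing in for glove.vectors so it runs without downloading anything:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vectors = torch.randn(10, 4)  # stand-in for glove.vectors

# Default is freeze=True: the weights are excluded from gradient updates
frozen_layer = nn.Embedding.from_pretrained(vectors)

# freeze=False keeps the pretrained values as a trainable starting point
tunable_layer = nn.Embedding.from_pretrained(vectors, freeze=False)

ids = torch.tensor([1, 3])
looked_up = frozen_layer(ids)

print(frozen_layer.weight.requires_grad)   # False
print(tunable_layer.weight.requires_grad)  # True
```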