I use an embedding layer to project one-hot indices into a continuous space. However, during training I don't want to update its weights. How can I do that?
You can set the weight of the embedding layer to not require gradients:
m = nn.Embedding(...)
m.weight.requires_grad = False
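If you are loading pretrained vectors, a convenient alternative (assuming a reasonably recent PyTorch that provides nn.Embedding.from_pretrained) is to build the frozen layer directly. A minimal sketch, where the pretrained tensor is made up for illustration:

import torch
import torch.nn as nn

# Hypothetical pretrained vectors: 1000 words, 300 dimensions
pretrained = torch.randn(1000, 300)

# Option 1: freeze an existing layer
m = nn.Embedding(1000, 300)
m.weight.requires_grad = False

# Option 2 (assumes nn.Embedding.from_pretrained is available): build a frozen layer directly
m = nn.Embedding.from_pretrained(pretrained, freeze=True)
print(m.weight.requires_grad)  # False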
Oh, I see. Thank you very much.
Oh, sorry. After setting Embedding.weight.requires_grad = False, an error was raised:
ValueError: optimizing a parameter that doesn't require gradients
The optimizer is used in the following way:
self.optimizer = optim.Adadelta(self.model.parameters(), lr=args.learning_rate)
And the model is defined as follows:
class DecomposableModel(nn.Module):
    def __init__(self, word_embedding, config):
        super(DecomposableModel, self).__init__()
        self.name = 'DecomposableModel'
        self.drop_p = config['drop_p']

        self.word_dim = word_embedding.embeddings.size(1)
        self.embedding = nn.Embedding(word_embedding.embeddings.size(0), self.word_dim)
        self.embedding.weight = nn.Parameter(word_embedding.embeddings)
        self.embedding.weight.requires_grad = False
        # self.embedding_normalize()

        self.F = nn.Linear(self.word_dim, config['F_dim'])
        self.G = nn.Linear(2 * self.word_dim, config['G_dim'])
        self.H = nn.Linear(2 * config['G_dim'], config['relation_num'])

        self.cuda_flag = config['cuda_flag']

    def forward(self, p_ids, h_ids):
        ......
Hi,
Please see this post, Freeze the learnable parameters of resnet and attach it to a new network, which solves exactly your problem.
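In short, the idea in that post is to pass the optimizer only the parameters that still require gradients, so the frozen embedding weight is skipped. A minimal sketch of that fix, reusing the names from the snippet above (self.model, args.learning_rate):

import torch.optim as optim

# Keep only parameters that still require gradients,
# so the frozen embedding weight is not handed to the optimizer.
trainable = [p for p in self.model.parameters() if p.requires_grad]
self.optimizer = optim.Adadelta(trainable, lr=args.learning_rate)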