I’ve written a very simple CBOW model, like so:
import torch
import torch.nn as nn


class CBOW(torch.nn.Module):
    def __init__(self, vocab_size, embedding_dim=200):
        super(CBOW, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lin = nn.Linear(embedding_dim, vocab_size)
        self.activation = nn.LogSoftmax(dim=1)

    def forward(self, inputs):
        # inputs: (batch, context_size) tensor of word indices
        embeds = self.embeddings(inputs)
        # Average the context embeddings: (batch, embedding_dim)
        out = torch.mean(embeds, dim=1)
        # Log-probabilities over the vocabulary: (batch, vocab_size)
        out = self.activation(self.lin(out))
        return out


# dataset and train_loader are defined elsewhere (not shown)
model = CBOW(len(dataset.word2index))
criterion = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.5)


def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # Tensors track gradients directly, so the deprecated
        # Variable wrapper is not needed (PyTorch >= 0.4)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()


if __name__ == '__main__':
    training_epochs = 10
    for epoch in range(training_epochs):
        train(epoch)
Now I want to implement negative sampling, so I need a way to update only the weights for certain rows of the output layer. One way I know of is to manually zero the gradients for the rows whose weights I don’t want updated, as suggested here:
However, someone mentioned to me that there is an easier way to do this that is “built in” to PyTorch. Can anyone tell me if this is true? Is there a built-in way to choose which weights get updated?
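For concreteness, the manual gradient-zeroing approach described above might look roughly like the sketch below. It is only an illustration under stated assumptions: `mask_output_grads` and `keep_rows` are hypothetical names (not part of PyTorch or of the linked suggestion), a standalone `nn.Linear` stands in for `CBOW.lin`, the data are random, and the optimizer is plain SGD without momentum.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embedding_dim = 10, 4
lin = nn.Linear(embedding_dim, vocab_size)  # stands in for CBOW.lin
optimizer = torch.optim.SGD(lin.parameters(), lr=0.1)  # no momentum (see note)


def mask_output_grads(layer, keep_rows):
    """Hypothetical helper: zero the gradient of every output row
    that is not listed in keep_rows."""
    mask = torch.zeros(layer.weight.size(0), dtype=torch.bool)
    mask[keep_rows] = True
    layer.weight.grad[~mask] = 0.0
    if layer.bias is not None:
        layer.bias.grad[~mask] = 0.0


hidden = torch.randn(3, embedding_dim)     # fake averaged context embeddings
target = torch.tensor([1, 4, 7])           # fake target word indices
keep_rows = torch.tensor([1, 4, 7, 2, 9])  # targets plus made-up negative samples

before = lin.weight.detach().clone()

optimizer.zero_grad()
log_probs = torch.log_softmax(lin(hidden), dim=1)
loss = nn.functional.nll_loss(log_probs, target)
loss.backward()
mask_output_grads(lin, keep_rows)  # must run between backward() and step()
optimizer.step()

# True for each output row whose weights were modified by the step
changed = (lin.weight.detach() != before).any(dim=1)
print(changed)
```

One caveat with this approach: a zeroed gradient only guarantees an unchanged row when the update depends on the current gradient alone. With momentum (as in the `momentum=0.5` optimizer above) or weight decay, rows with a zero gradient can still move because of the momentum buffer or the decay term.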