I’ve written a very simple CBOW model, like so:
```python
import torch
import torch.nn as nn


class CBOW(torch.nn.Module):
    def __init__(self, vocab_size, embedding_dim=200):
        super(CBOW, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lin = nn.Linear(embedding_dim, vocab_size)
        self.activation = nn.LogSoftmax(dim=1)

    def forward(self, inputs):
        # Average the context word embeddings, then project to vocab-sized
        # log-probabilities.
        embeds = self.embeddings(inputs)
        out = torch.mean(embeds, dim=1)
        out = self.activation(self.lin(out))
        return out


model = CBOW(len(dataset.word2index))
criterion = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.5)


def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()


if __name__ == '__main__':
    training_epochs = 10
    for epoch in range(training_epochs):
        train(epoch)
```
Now I want to implement negative sampling, so I need a way to update only the weights corresponding to certain rows of the output layer. I know one option is to manually zero the gradients of the output rows whose weights I don't want updated, as suggested here:
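For concreteness, the manual approach I have in mind would look something like the sketch below. This is just an illustration, not working code from my project: `train_negative_masked` and `num_negatives` are made-up names, and I'm sampling negatives uniformly here, whereas a real implementation would sample from word2vec's unigram^0.75 distribution.

```python
def train_negative_masked(epoch, num_negatives=5):
    model.train()
    vocab_size = model.lin.out_features
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()

        # Rows allowed to update: the true target words for this batch,
        # plus `num_negatives` uniformly sampled negative rows.
        negatives = torch.randint(0, vocab_size, (num_negatives,))
        keep = torch.zeros(vocab_size, dtype=torch.bool)
        keep[target] = True
        keep[negatives] = True

        # Zero the gradients of every other output row, so only the
        # sampled rows of the final Linear layer are changed by the step.
        model.lin.weight.grad[~keep] = 0.0  # shape (vocab_size, embedding_dim)
        model.lin.bias.grad[~keep] = 0.0    # shape (vocab_size,)

        optimizer.step()
```

The obvious downsides are that the full softmax is still computed in the forward pass (the masking only restricts the update), and that with momentum the zeroed rows can still drift once their momentum buffers are nonzero.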
However, someone mentioned to me that there is an easier way to do this that is "built-in" to PyTorch. Can anyone tell me if that's true? Is there a built-in way to choose which weights get updated?