Minimizing distance between a latent vector and its context latent vectors

MANSUM · October 24, 2017, 9:11am

Suppose I have a myNetwork class defined as:

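import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

class myNetwork(nn.Module):
    def __init__(self, a, b, embedding_dim):
        # super() must name this class (the original referred to an undefined `modeler`)
        super(myNetwork, self).__init__()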
        self.embed1 = nn.Embedding(a, embedding_dim)
        self.embed2 = nn.Embedding(b, embedding_dim)

        
    def forward(self, idx1, idx2, context_idxs1, context_idxs2):
        
        embeds1 = self.embed1(idx1)
        embeds2 = self.embed2(idx2)

        embeds1_context = self.embed1(context_idxs1)
        embeds2_context = self.embed2(context_idxs2)

        embeds1_context  = embeds1_context.sum(1) / embeds1_context.data.nelement()
        embeds2_context  = embeds2_context.sum(1) / embeds2_context.data.nelement()
        
        loss1 = torch.sum((embeds1 - embeds1_context)**2) / embeds1.data.nelement()
        loss2 = torch.sum((embeds2 - embeds2_context)**2) / embeds2.data.nelement()

        output = torch.sum((embeds1- embeds2)**2,1)

        return output, loss1, loss2

In the training loop, I am going to call the forward function twice (once for the positive example and once for the negative example) and minimize the margin loss:

def training(self):
    model = myNetwork()
        
    criterion = nn.MarginRankingLoss(margin=self.margin)

    optimizer = optim.Adam(model.parameters(), lr = self.lRate)
        
    for epoch in range(numEpoch):
        for batch_idx, batch in enumerate(train_loader):
            pos, loss1_pos, loss_2_pos = model(xxx)
            neg, loss1_neg, loss_2_neg = model(xxx)

            # target -1 asks for the second input (neg) to be ranked higher than the first (pos)
            loss = criterion(pos, neg, Variable(torch.FloatTensor([-1])))

            loss = loss + (loss1_pos + loss_2_pos + loss1_neg + loss_2_neg)
                
            # clear gradients left over from the previous batch before backpropagating
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

With the above code, I want to 1) compute a margin-based loss between the positive sample and the negative sample, and 2) minimize the distance between a latent vector and its context latent vectors, e.g., (embeds1 - embeds1_context)^2, where I also want to update the rows of embed1 that generated embeds1_context.

I am confident that the margin loss is computed correctly, but I am not sure whether loss1_pos, loss_2_pos, loss1_neg, and loss_2_neg are computed the way I intend.

Can anyone help me?

SimonW · October 24, 2017, 8:38pm

I didn’t look at the maths in depth. But if you are wondering if using the same module twice will be problematic for pytorch, the answer is no. The autograd engine will properly handle all relevant variables and keep things correct.
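As a minimal sketch of that point, the snippet below calls one `nn.Embedding` twice and backpropagates a combined loss once. The sizes, indices, and the factor of 2 are illustrative assumptions, and it uses the current `torch.tensor` API rather than `Variable`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed = nn.Embedding(10, 4)  # hypothetical sizes, for illustration only

# Two forward passes through the same module, like the positive and negative calls.
pos = embed(torch.tensor([1])).sum()
neg = embed(torch.tensor([2])).sum()

# A single backward pass on the combined loss reaches both calls.
(pos + 2 * neg).backward()

grad = embed.weight.grad
print(grad[1])  # gradient contributed by the first call
print(grad[2])  # gradient contributed by the second call
```

Row 1 receives the gradient from the first call and row 2 from the second, while rows that were never looked up keep a zero gradient.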

MANSUM · October 25, 2017, 1:30am

In fact, my question is whether the way of computing loss1 and loss2 is correct. That is, whether (embeds1 - embeds1_context)^2 can be achieved by loss1 = torch.sum((embeds1 - embeds1_context)**2).

richard · October 25, 2017, 2:07am

loss1 = torch.sum((embeds1 - embeds1_context)**2) computes the sum of squared errors; divided by the number of elements, as in your forward function, it is the mean-squared error loss, yes.
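To make the relation between the summed and the averaged form concrete, here is a small sketch on made-up tensors (the values of `a` and `b` are assumptions for illustration). It compares the hand-written expression with `F.mse_loss`, whose default reduction is the mean.

```python
import torch
import torch.nn.functional as F

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.zeros(2, 2)

sse = torch.sum((a - b) ** 2)   # sum of squared errors: 1 + 4 + 9 + 16
mse = sse / a.nelement()        # divide by the element count to get the mean

# The built-in mean-squared error agrees with the hand-written version.
assert torch.isclose(mse, F.mse_loss(a, b))
print(sse.item(), mse.item())
```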

MANSUM · October 25, 2017, 2:14am

So for example, let’s say embeds1 is the embedding vector of idx1 = 5 and embeds1_context is the weighted sum of the embedding vectors of context_idxs1 = [2,4,6,8].

What I want is that backpropagating loss1 = torch.sum((embeds1 - embeds1_context)**2) updates not only row 5 of self.embed1, but also rows 2, 4, 6, and 8 of self.embed1.

Will I achieve this with the above code?

I think my previous questions were a little bit unclear.

SimonW · October 25, 2017, 2:58pm

Yes, it should update those as well. Is there any part of our documentation that implies it won’t? Let us know so we can update it!
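One way to check this yourself is to inspect which rows of the embedding weight receive a nonzero gradient. The sketch below uses the indices from the example above. The table size, embedding dimension, and the use of `mean(dim=1)` for the context average are assumptions; the original code divides by `nelement()`, which scales by the total element count rather than by the number of context items.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed1 = nn.Embedding(10, 3)  # hypothetical table: 10 rows, 3 dimensions

idx1 = torch.tensor([5])
context_idxs1 = torch.tensor([[2, 4, 6, 8]])

embeds1 = embed1(idx1)                                # shape (1, 3)
embeds1_context = embed1(context_idxs1).mean(dim=1)   # shape (1, 3)

loss1 = torch.sum((embeds1 - embeds1_context) ** 2)
loss1.backward()

# Rows whose gradient is nonzero are the ones an optimizer step would change.
row_norms = embed1.weight.grad.abs().sum(dim=1)
rows_with_grad = row_norms.nonzero().flatten().tolist()
print(rows_with_grad)
```

The printed list should contain the target row 5 together with the context rows 2, 4, 6, and 8, and no others.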