Indexing embeddings, matrix factorization

jbuddy_13 · October 1, 2019, 5:25pm

Hello all,

I’m trying to execute a modified version of matrix factorization via deep learning. The goal is to find a matrix of shape [9724x300] where the rows are items and there are (arbitrarily) 300 features. The function would be optimized when the dot product of vector_i and vector_j is really, really close to the value in the interaction matrix, Xij.

Xij has dimensions [9724x9724], where the value in cell [i,j] is the number of users who liked both items i and j. Ergo, when optimized, Vi*VjT should be really close to the number of users who liked both items i and j.
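For concreteness, here is a minimal sketch of how an item-to-item matrix like this can be built. It assumes a small binary user-by-item array, here called `liked` (a made-up name and made-up toy data, not the real dataset), where 1 means the user liked the item.

```python
import numpy as np

# Hypothetical toy data: 4 users x 3 items, 1 means the user liked the item.
liked = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
])

# (liked.T @ liked)[i, j] counts the users who liked both item i and item j.
Xij = liked.T @ liked
print(Xij)
```

The result is symmetric, and the diagonal entry [i,i] is simply the number of users who liked item i.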

I’ve modified this code from a tutorial. The key difference is that in this resource, the author has a user-to-item matrix, not an item-to-item matrix.

I’m stuck trying to index vectors i and j in tensor self.vectors. It appears the datatype did not match what was expected, despite making i and j into LongTensors.

Any feedback would be appreciated!

import torch
from torch.autograd import Variable

class MatrixFactorization(torch.nn.Module):
    def __init__(self, n_items=len(movie_ids), n_factors=300):
        super().__init__()
        
        self.vectors = nn.Embedding(n_items, n_factors,sparse=True)
        

    def forward(self, i,j):
        return (self.vectors([i])*torch.transpose(self.vectors([j]))).sum(1)

    def predict(self, i, j):
        return self.forward(i, j)

model = MatrixFactorization(n_items= len(movie_ids),n_factors=300)
loss_fn = nn.MSELoss() 
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for i in range(len(movie_ids)):
    for j in range(len(movie_ids)):
    # get user, item and rating data
        rating = Variable(torch.FloatTensor([Xij[i, j]]))
        # predict
        i = Variable(torch.LongTensor([int(i)]))
        j = Variable(torch.LongTensor([int(j)]))
        prediction = model(i, j)
        loss = loss_fn(prediction, rating)

        # backpropagate
        loss.backward()

        # update weights
        optimizer.step()

And the error I receive:

TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not list

albanD · October 1, 2019, 5:53pm

Hi,

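Variables are not needed anymore, and a separate predict method should not be needed either. The TypeError comes from self.vectors([i]): that passes a Python list to the embedding, which expects the LongTensor itself. An updated version of your code is: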

import torch
from torch import nn

class MatrixFactorization(torch.nn.Module):
    def __init__(self, n_items=len(movie_ids), n_factors=300):
        super().__init__()
        
        self.vectors = nn.Embedding(n_items, n_factors,sparse=True)
        

    def forward(self, i,j):
        # i and j are LongTensors here of size (batch)
        feat_i = self.vectors(i)
        feat_j = self.vectors(j)
        # feat_i and feat_j are of size (batch, n_factors)
        # Since you only want the interactions element-wise
        # for i and j in the batch dimension, you want diag(feat_i @ feat_j.t())
        # This can be efficiently computed using element-wise product and summing
        result = (feat_i * feat_j).sum(-1)
        # result is of size (batch)
        return result


model = MatrixFactorization(n_items= len(movie_ids),n_factors=300)
loss_fn = nn.MSELoss() 
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for i in range(len(movie_ids)):
    for j in range(len(movie_ids)):
        # get the item pair and its target value
        rating = torch.FloatTensor([Xij[i, j]])
        # predict (new names, so the loop counters i and j are not overwritten)
        i_idx = torch.LongTensor([i])
        j_idx = torch.LongTensor([j])
        prediction = model(i_idx, j_idx)
        loss = loss_fn(prediction, rating)

        # Reset the gradients to 0
        optimizer.zero_grad()

        # backpropagate
        loss.backward()

        # update weights
        optimizer.step()

Note that with this code you can use a real batch size, for example:

# Generate a batch of 4 pairs i,j
i = torch.LongTensor([1, 3, 4, 5])
j = torch.LongTensor([2, 4, 6, 7])

# Get the ground truth and do forward for all of them at once
# (this assumes Xij is a float tensor, so it can be indexed with i and j)
ratings = Xij[i, j]
pred = model(i, j)
# By default, MSELoss will compute the average loss over the batch (you can change that if you need, check the doc for how to do so)
loss = loss_fn(pred, ratings)
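As a quick sanity check of the comment in forward, the sketch below compares the diagonal of the full pairwise product with the element-wise product summed over the feature dimension. The random tensors are only stand-ins for the embedding outputs, and the tolerance is an arbitrary choice for float32.

```python
import torch

torch.manual_seed(0)
feat_i = torch.randn(4, 300)  # stand-in for self.vectors(i), batch of 4
feat_j = torch.randn(4, 300)  # stand-in for self.vectors(j)

# Full (4, 4) matrix of all pairwise dot products; only its diagonal is needed.
full = feat_i @ feat_j.t()
via_diag = torch.diagonal(full)

# Element-wise product summed over the feature dimension: shape (4,).
via_sum = (feat_i * feat_j).sum(-1)

same = torch.allclose(via_diag, via_sum, atol=1e-4)
print(via_sum.shape, same)
```

The second form avoids building the (batch, batch) matrix, which matters once the batch gets large.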

jbuddy_13 · October 1, 2019, 6:37pm

Thank you very much for your detailed response!
Can you elaborate on the last block of code beginning at

# Generate a batch of 4 pairs i,j

Does this code block replace the nested for loop?
And what is the significance of the pairs, just an example? Or will this sample large batches of data?

i = torch.LongTensor([1, 3, 4, 5])
j = torch.LongTensor([2, 4, 6, 7])

Thanks @albanD!

albanD · October 1, 2019, 6:43pm

PyTorch is built to work with batches of data. That allows it to perform larger operations and thus make better use of devices such as GPUs.
A batch is basically a bunch of independent data that you process at the same time.

In your case, if len(movie_ids) == 2, you have 4 pairs of indices to evaluate: (0,0), (0,1), (1,0), (1,1).
You can do this using your nested for loops or by doing a single forward with a batch of 4 samples with i = torch.LongTensor([0, 0, 1, 1]) and j=torch.LongTensor([0, 1, 0, 1]). That way, in one call to your model, you get the result for all four samples, instead of doing 4 calls to your model with the for loops.
Loss functions such as MSE are built to support batched inputs like this, and you can see in the doc that MSELoss has a reduction argument to choose how the individual values for each element in your batch should be aggregated into the final loss value.
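A small sketch of both points, under a few assumptions: the index pairs for the two-item case are generated with torch.cartesian_prod (one possible way to do it), and the predictions and targets are made-up numbers rather than model outputs.

```python
import torch
from torch import nn

n_items = 2  # toy size, matching the two-item case above

# All (i, j) index pairs, one pair per row: (0,0), (0,1), (1,0), (1,1).
pairs = torch.cartesian_prod(torch.arange(n_items), torch.arange(n_items))
i, j = pairs[:, 0], pairs[:, 1]

# Made-up predictions and targets for the four pairs.
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])
target = torch.tensor([1.0, 0.0, 3.0, 2.0])

per_pair = nn.MSELoss(reduction="none")(pred, target)   # one value per pair
mean_loss = nn.MSELoss(reduction="mean")(pred, target)  # default: average
sum_loss = nn.MSELoss(reduction="sum")(pred, target)    # total over the batch

print(i.tolist(), j.tolist())
print(per_pair.tolist(), mean_loss.item(), sum_loss.item())
```

With 9724 items the full set of pairs is roughly 94.5 million, so in practice it is probably necessary to split the pairs into smaller mini-batches rather than pass them all at once.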

jbuddy_13 · October 1, 2019, 7:32pm

Awesome! I tried the batch method you’ve described, and the following warning was returned:

/anaconda3/lib/python3.6/site-packages/torch/nn/modules/loss.py:431: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([5931640])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)

However, the code hasn’t thrown an error (yet) so maybe all is well.

albanD · October 1, 2019, 7:39pm

Check the size of the elements you give to your loss.
One has the full batch size, but the other has size 1. They should both have the batch size, so something is wrong somewhere.
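To illustrate why the warning matters, here is a minimal sketch with made-up numbers. It is not a reconstruction of the actual bug, since the cause is not stated here; it only shows that a target of size 1 is broadcast against the whole batch and silently gives a different loss.

```python
import warnings

import torch
import torch.nn.functional as F

pred = torch.tensor([1.0, 2.0, 3.0])         # input of size (batch,)
bad_target = torch.tensor([2.0])             # size (1,): broadcast against pred
good_target = torch.tensor([2.0, 2.0, 3.0])  # size (batch,)

# Printing the shapes is usually the fastest way to spot the mismatch.
print(pred.shape, bad_target.shape, good_target.shape)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the size-mismatch UserWarning
    broadcast_loss = F.mse_loss(pred, bad_target)

correct_loss = F.mse_loss(pred, good_target)
print(broadcast_loss.item(), correct_loss.item())
```

Here every prediction is compared against the single value 2.0 in the first case, which is a different loss from the one computed against the per-element targets.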

jbuddy_13 · October 2, 2019, 12:11am

Found it
Thanks again for your help!