How to normalize embedding vectors?

Hi, I am using a network to embed some entities into a vector space. Since the length of the vectors decreases during training, I want to normalize their length to 1 at the end of each step. Is there a tool I can use to normalize the embedding vectors?

13 Likes

I think the best thing you can do is to save the embedded indices and normalize their rows manually after the update (index_select them, compute the row-wise norm, divide, and index_copy back into the weights). We only support automatic max-norm clipping.
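
A minimal sketch of that manual renormalization in current PyTorch (emb and idx here are made up for illustration: an nn.Embedding table and the indices touched by the last update):

import torch
import torch.nn as nn

emb = nn.Embedding(1000, 64)          # hypothetical embedding table
idx = torch.tensor([3, 17, 256])      # indices updated in the last step

with torch.no_grad():
    rows = emb.weight.index_select(0, idx)        # pull out the updated rows
    norms = rows.norm(p=2, dim=1, keepdim=True)   # row-wise L2 norm
    emb.weight.index_copy_(0, idx, rows / norms)  # write the normalized rows back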

2 Likes

If you want to normalize a vector as a part of a model, this should do it:

# assume q is the tensor to be L2-normalized, along dim 1
qn = torch.norm(q, p=2, dim=1).detach()
q = q.div(qn.expand_as(q))

Note the detach(); it is essential for the gradients to work correctly. I’m assuming you want the norm to be treated as a constant while dividing the tensor by it.
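
A toy check of that point (a minimal sketch using current PyTorch, where keepdim=True is needed in norm()): with detach() the norm is a constant to autograd, so every entry in gradient row i is simply 1 / ||q_i||; without it, the gradient also flows through the norm itself.

import torch

q = torch.randn(3, 5, requires_grad=True)

# With detach(): the norm is treated as a constant during backprop.
qn = torch.norm(q, p=2, dim=1, keepdim=True).detach()
(q / qn).sum().backward()
print(q.grad)                 # every entry of row i equals 1 / ||q_i||

# Without detach(): the gradient also flows through the norm.
q.grad = None
qn = torch.norm(q, p=2, dim=1, keepdim=True)
(q / qn).sum().backward()
print(q.grad)                 # generally different from the detached case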

18 Likes

Can I use it to normalize the embedding after each update during training?

1 Like

Yes, it could be used to normalize an embedding. I suggest not using the detach() though; I found that it degrades performance.

1 Like

I see, thanks. But I need to constrain the norms of the embeddings (not larger than 1). Do you have any better suggestions for doing that?
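
If the constraint is just that the norm must not exceed 1, the built-in max-norm clipping mentioned earlier may already be enough; a minimal sketch with nn.Embedding (sizes are made up):

import torch
import torch.nn as nn

# max_norm renormalizes any looked-up row whose norm exceeds 1 (in place, at lookup time).
emb = nn.Embedding(num_embeddings=1000, embedding_dim=64, max_norm=1.0)

idx = torch.tensor([1, 5, 42])
vecs = emb(idx)
print(vecs.norm(p=2, dim=1))   # every value is <= 1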

Why do we have to use detach()? Also, in the new PyTorch version you have to use keepdim=True in the norm() method. A simple implementation of L2 normalization:

# suppose x is a Variable of size [4, 16], 4 is batch_size, 16 is feature dimension
x = Variable(torch.rand(4, 16), requires_grad=True)
norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm.expand_as(x))
4 Likes

If you use keepdim=True, you don’t even need expand_as(x). The following works for me:

norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm)
5 Likes

PyTorch now has a normalize function, so it is easy to do L2 normalization for features. Suppose x is a feature tensor of size N×D (N is the batch size and D is the feature dimension); we can simply use the following:

import torch.nn.functional as F
x = F.normalize(x, p=2, dim=1)
30 Likes

Yes, it failed quickly with detach()

What if the variable I am trying to normalize is in fact an nn.Parameter rather than a Variable?

Then in this case I get the following error for any of the above options I try:

TypeError: cannot assign ‘torch.autograd.variable.Variable’ as parameter ‘W’ (torch.nn.Parameter or None expected)

class myUnit(nn.Module):
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter, requires_grad=True)
    def forward(self, input):
        """
        Whatever operation. Just an example:
        """
        # This reassignment is what triggers the TypeError above.
        self.myParameter = F.normalize(self.myParameter, p=2, dim=1)
        output = self.myParameter * input - 1
        return output

Can I still keep using Parameter in my customized network and be able to also normalize it?

Maybe you can try:

self.myParameter.data = F.normalize(self.myParameter.data, p=2, dim=1)
1 Like
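
Another option, a sketch that mirrors the myUnit example above (assuming only the forward pass needs the normalized value): normalize a view of the parameter instead of reassigning the attribute, which avoids the TypeError entirely.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyUnit(nn.Module):
    def __init__(self, myParameter):
        super().__init__()
        self.myParameter = nn.Parameter(myParameter, requires_grad=True)

    def forward(self, input):
        # Use a normalized copy of the parameter for this forward pass only;
        # the stored Parameter is never reassigned, and gradients still flow
        # back to it through F.normalize.
        w = F.normalize(self.myParameter, p=2, dim=1)
        return w * input - 1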

@sssohrab I’m facing the exact same issue. Can you please let me know how exactly you got through this? Also, did you want the gradients to pass through the norm step?

@ptrblck Could you please help answer this question from @sssohrab? I am facing the same issue (“RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss.”)
Thank you so much in advance for your help!

Your issue seems to be unrelated to the linked post, as your error points towards a problem when running a DDP (DistributedDataParallel) model.

Sorry I copied the wrong error message. The above message was for the following code:

self.embeddings = torch.nn.Parameter(F.normalize(self.embeddings, p=2, dim=-1))

What I would like to do is to constrain the weights of my model so that they always lie on the unit sphere, and I finally found the solution:

self.embeddings.data.copy_(F.normalize(self.embeddings.data, p=2, dim=-1))

Hope that this will be useful for future readers.
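
For completeness, a hedged sketch of where such a projection step typically sits in a training loop (the embedding table, loss, and optimizer here are made up for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical setup: a trainable embedding table constrained to the unit sphere.
embeddings = nn.Parameter(torch.randn(100, 16))
optimizer = torch.optim.SGD([embeddings], lr=0.1)

for step in range(10):
    idx = torch.randint(0, 100, (8,))
    loss = embeddings[idx].sum()        # stand-in loss, just for illustration
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    # Project the updated weights back onto the unit sphere after each step.
    with torch.no_grad():
        embeddings.copy_(F.normalize(embeddings, p=2, dim=-1))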