# How to normalize embedding vectors?

(Maplewizard) #1

Hi, I am using a network to embed some entities into a vector space. Because the length of the vectors decreases during training, I want to normalize their length to 1 at the end of each step. Is there any tool that I can use to normalize the embedding vectors?

Normalizing Embeddings

I think the best thing you can do is to save the embedded indices and normalize their rows manually after the update (just `index_select` them, compute the row-wise norm, divide, and `index_copy` back into the weights). We only support automatic max-norm clipping.
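The manual procedure described above could look roughly like the following. This is only a sketch: the embedding table `emb`, the indices `idx`, the SGD optimizer, and the placeholder loss are assumptions made for illustration, not part of the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(10, 4)                 # hypothetical embedding table
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

idx = torch.tensor([1, 3, 3, 7])          # indices embedded in this step
loss = emb(idx).pow(2).sum()              # placeholder loss
opt.zero_grad()
loss.backward()
opt.step()

# Renormalize only the rows that were used, outside of autograd.
with torch.no_grad():
    used = idx.unique()
    rows = emb.weight.index_select(0, used)
    # clamp_min guards against division by zero for an all-zero row
    norms = rows.norm(p=2, dim=1, keepdim=True).clamp_min(1e-12)
    emb.weight.index_copy_(0, used, rows / norms)

row_norms = emb.weight.detach()[used].norm(dim=1)   # each entry is now ~1
```

Rows that were not looked up in this step are left as they were.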

(Samarth Brahmbhatt) #3

If you want to normalize a vector as a part of a model, this should do it:

```
# assume q is the tensor to be L2-normalized along dim 1
qn = torch.norm(q, p=2, dim=1).detach()
q = q.div(qn.expand_as(q))
```

Note the `detach()`, that is essential for the gradients to work correctly. I’m assuming you want the norm to be treated as a constant while dividing the Tensor by it.
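To make the effect of `detach()` concrete, here is a small sketch comparing the gradient with and without it. The weights `w`, the helper `grad_of_normalized`, and the toy loss are made up for illustration. Without `detach()`, the gradient has no component along `q` (changing the length of `q` cannot change the normalized output); with `detach()`, the gradient is simply the upstream gradient divided by the norm.

```python
import torch

torch.manual_seed(0)
w = torch.randn(3, 5)                      # arbitrary downstream weights

def grad_of_normalized(q0, detach_norm):
    """Gradient of sum(q / ||q|| * w) with respect to q."""
    q = q0.clone().requires_grad_(True)
    norm = q.norm(p=2, dim=1, keepdim=True)
    if detach_norm:
        norm = norm.detach()               # norm is treated as a constant
    (q / norm * w).sum().backward()
    return q.grad

q0 = torch.randn(3, 5)
g_detached = grad_of_normalized(q0, detach_norm=True)
g_full = grad_of_normalized(q0, detach_norm=False)

# Row-wise component of each gradient along q itself.
radial_detached = (g_detached * q0).sum(dim=1)   # generally non-zero
radial_full = (g_full * q0).sum(dim=1)           # ~0 up to rounding
```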

#4

Can I use it to normalize the embedding after each update during training?

(Samarth Brahmbhatt) #5

Yes it could be used to normalize an embedding. I suggest not using the `detach()` though. I found that it degrades performance.

#6

I see, thanks. But I need to constrain the norms of the embeddings (no bigger than 1). Do you have any better suggestions for doing that?
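For a "norm at most 1" constraint, one option is to clip rows after each update with `torch.renorm`, which rescales only the rows whose norm exceeds `maxnorm`. The sketch below assumes a hypothetical `nn.Embedding` named `emb`; the optimizer step is left out. `nn.Embedding(..., max_norm=1.0)` is a related built-in option that renormalizes rows when they are looked up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(6, 4)                   # hypothetical embedding table
with torch.no_grad():
    emb.weight[:3] *= 0.1                  # make some rows short on purpose
before = emb.weight.detach().clone()

# After optimizer.step(): clip rows whose L2 norm exceeds 1, leave the rest alone.
with torch.no_grad():
    emb.weight.copy_(torch.renorm(emb.weight, p=2, dim=0, maxnorm=1.0))

norms = emb.weight.detach().norm(dim=1)    # every entry is now <= 1
```

Unlike full normalization, rows that are already shorter than 1 keep their length.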

(jdhao) #7

Why do we have to use `detach()`? Also, in newer PyTorch versions you have to pass `keepdim=True` to the `norm()` method. A simple implementation of L2 normalization:

```
# suppose x is a Variable of size [4, 16], 4 is batch_size, 16 is feature dimension
norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm.expand_as(x))
```

(Martin Schröder) #8

If you use `keepdim=True`, you don’t even need `expand_as(x)`. The following works for me:

```
norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm)
```
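A quick look at the shapes shows why `keepdim=True` is enough. The tensor values here are chosen only for illustration.

```python
import torch

x = torch.tensor([[3.0, 4.0], [0.0, 2.0]])
norm_keep = x.norm(p=2, dim=1, keepdim=True)   # shape [2, 1]
norm_flat = x.norm(p=2, dim=1)                 # shape [2]
x_normalized = x / norm_keep                   # broadcasts [2, 2] / [2, 1] row by row
```

With the flat norm of shape `[N]`, dividing a `[N, D]` tensor raises a shape error when `N != D`, and when `N == D` it silently broadcasts along the wrong axis, so keeping the reduced dimension is the safer habit.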

(jdhao) #9

PyTorch now has a `normalize` function, so it is easy to do L2 normalization for features. Suppose `x` is a feature tensor of size `N*D` (`N` is the batch size and `D` is the feature dimension); we can simply use the following:

```
import torch.nn.functional as F
x = F.normalize(x, p=2, dim=1)
```
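One practical difference from the manual division is how all-zero rows are handled: `F.normalize` divides by `max(norm, eps)` (with `eps=1e-12` by default), so a zero row stays zero instead of turning into NaN. A minimal sketch with made-up values:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[3.0, 4.0], [0.0, 0.0]])
manual = x / x.norm(p=2, dim=1, keepdim=True)   # second row is 0 / 0, i.e. NaN
safe = F.normalize(x, p=2, dim=1)               # second row stays all zeros
```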

(Liang) #10

Yes, it failed quickly with `detach()`

(Sohrab Ferdowsi ) #11

What if the variable I am trying to normalize is in fact a `Parameter` from `nn.parameter` rather than a `Variable`?

Then in this case I get the following error for any of the above options I try:

TypeError: cannot assign ‘torch.autograd.variable.Variable’ as parameter ‘W’ (torch.nn.Parameter or None expected)

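```
import torch.nn as nn
import torch.nn.functional as F

class myUnit(nn.Module):
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        # Register the tensor as a parameter (missing from the snippet as posted)
        self.myParameter = nn.Parameter(myParameter)

    def forward(self, input):
        """
        Whatever operation. Just an example:
        """
        # This assignment raises the TypeError: F.normalize does not return a Parameter
        self.myParameter = F.normalize(self.myParameter, p=2, dim=1)
        output = self.myParameter * input - 1
        return output
```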

Can I still keep using `Parameter ` in my customized network and be able to also normalize it?

(Guohai93) #12

Maybe you can try:

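```
# myParameter is itself the Parameter, so there is no `.weight` attribute;
# `.weight.data` would apply if it were a module such as nn.Embedding
self.myParameter.data = F.normalize(self.myParameter.data, p=2, dim=1)
```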
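Two other ways to keep a `Parameter` and still normalize it are sketched below, assuming a hypothetical module `MyUnit` with a parameter `W`: either normalize into a local variable inside `forward` (so the `Parameter` is never reassigned and gradients flow through the normalization), or overwrite the stored values in place under `torch.no_grad()`, for example after each optimizer step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyUnit(nn.Module):
    def __init__(self, init_weight):
        super().__init__()
        self.W = nn.Parameter(init_weight.clone())

    def forward(self, x):
        # Local variable: the Parameter itself is never reassigned.
        W_unit = F.normalize(self.W, p=2, dim=1)
        return W_unit * x - 1

    @torch.no_grad()
    def renormalize_(self):
        # In-place overwrite of the stored values, e.g. after optimizer.step().
        self.W.copy_(F.normalize(self.W, p=2, dim=1))

torch.manual_seed(0)
unit = MyUnit(torch.randn(3, 4))
out = unit(torch.ones(3, 4))
unit.renormalize_()
```

Which of the two fits better depends on whether the normalization should be part of the computation graph or only a constraint applied between updates.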