Hi, I am using a network to embed some entities into a vector space. Because the length of the vectors decreases during training, I want to normalize their length to 1 at the end of each step. Is there any tool that I can use to normalize the embedding vectors?

# How to normalize embedding vectors?

**apaszke**(Adam Paszke) #2

I think the best thing you can do is to save the embedded indices and normalize their rows manually after the update (just `index_select` them, compute the row-wise norm, divide, and `index_copy` back into the weights). We only support automatic max norm clipping.
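A minimal sketch of the manual approach described above, assuming an `nn.Embedding` layer trained with plain SGD. The layer sizes, the placeholder loss, and the names `emb`, `idx`, and `used` are illustrative assumptions, not something from the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

# Indices embedded in this training step (saved for the renormalization below).
idx = torch.tensor([1, 3, 3, 7])

# Placeholder loss, only here so that there is an update to apply.
loss = emb(idx).pow(2).sum()
opt.zero_grad()
loss.backward()
opt.step()

# After the update, renormalize only the rows that were used in this step.
with torch.no_grad():
    used = torch.unique(idx)                   # avoid duplicate indices in index_copy_
    rows = emb.weight.index_select(0, used)    # gather the touched rows
    norms = rows.norm(p=2, dim=1, keepdim=True).clamp_min(1e-12)
    emb.weight.index_copy_(0, used, rows / norms)  # write unit-length rows back
```

The `torch.no_grad()` block keeps the in-place write out of the autograd graph, so the next forward pass starts from the renormalized weights.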

**samarth-robo**(Samarth Brahmbhatt) #3

If you want to normalize a vector as a part of a model, this should do it:

Assume `q` is the tensor to be L2-normalized along dim 1:

`qn = torch.norm(q, p=2, dim=1).detach()`

`q = q.div(qn.expand_as(q))`

Note the `detach()`; it is essential for the gradients to work correctly. I’m assuming you want the norm to be treated as a constant while dividing the tensor by it.

**samarth-robo**(Samarth Brahmbhatt) #5

Yes, it could be used to normalize an embedding. I suggest not using the `detach()` though. I found that it degrades performance.
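To make the effect of `detach()` concrete, here is a small sketch that compares the gradient with and without it. The tensor shapes, the weighting tensor `w`, and the helper `grad_of_normalized` are assumptions made for illustration. When the norm stays in the graph, the gradient has no component along `q` itself (rescaling `q` cannot change the normalized output); with `detach()` that radial component remains.

```python
import torch

torch.manual_seed(0)
q = torch.randn(3, 5, requires_grad=True)
w = torch.randn(3, 5)  # arbitrary weights so the loss depends on the direction of q


def grad_of_normalized(detach_norm):
    """Gradient of sum(w * q / ||q||) w.r.t. q, with or without detaching the norm."""
    norm = q.norm(p=2, dim=1, keepdim=True)
    if detach_norm:
        norm = norm.detach()  # treat the norm as a constant
    loss = (q / norm * w).sum()
    (grad,) = torch.autograd.grad(loss, q)
    return grad


g_detached = grad_of_normalized(True)
g_full = grad_of_normalized(False)

# Component of each gradient along q, one value per row.
radial_full = (g_full * q.detach()).sum(dim=1)
radial_detached = (g_detached * q.detach()).sum(dim=1)
```

Which variant trains better is an empirical question; the sketch only shows that the two gradients genuinely differ.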

**asdass**#6

I see, thanks. But I need to constrain the norms of the embeddings (not bigger than 1). Do you have any better suggestions for doing this?
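One possible way to enforce a norm of at most 1, sketched here as an assumption rather than as the thread's answer, is the `max_norm` argument of `nn.Embedding` (the automatic max norm clipping mentioned earlier), or `torch.renorm` applied to the whole weight matrix after an update. The layer sizes and the artificial scaling by 5 are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(num_embeddings=10, embedding_dim=4, max_norm=1.0)

# Inflate the weights so that the rows start out well above norm 1.
with torch.no_grad():
    emb.weight.mul_(5.0)

# Option 1: max_norm rescales the looked-up rows in place during the lookup.
idx = torch.tensor([0, 2, 5])
out = emb(idx)
looked_up_norms = out.detach().norm(dim=1)

# Option 2: clip every row at once, e.g. right after optimizer.step().
with torch.no_grad():
    emb.weight.copy_(torch.renorm(emb.weight, p=2, dim=0, maxnorm=1.0))
all_norms = emb.weight.detach().norm(dim=1)
```

Note that `max_norm` only touches rows that are actually looked up, whereas the `torch.renorm` call clips all rows; rows already below the limit are left unchanged in both cases.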

**jdhao**(jdhao) #7

Why do we have to use `detach()`? Also, in newer PyTorch versions you have to pass `keepdim=True` to the `norm()` method. A simple implementation of L2 normalization:

```
import torch
from torch.autograd import Variable

# Suppose x is a Variable of size [4, 16]: 4 is the batch size, 16 is the feature dimension.
x = Variable(torch.rand(4, 16), requires_grad=True)
norm = x.norm(p=2, dim=1, keepdim=True)  # row-wise L2 norm, shape [4, 1]
x_normalized = x.div(norm.expand_as(x))
```

**moi90**(Martin Schröder) #8

If you use `keepdim=True`, you don’t even need `expand_as(x)`. The following works for me:

```
norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm)
```

**jdhao**(jdhao) #9

PyTorch now has a normalize function, so it is easy to do L2 normalization for features. Suppose `x` is a feature tensor of size `N*D` (`N` is the batch size and `D` is the feature dimension); we can simply use the following:

```
import torch.nn.functional as F
x = F.normalize(x, p=2, dim=1)
```
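One practical difference between `F.normalize` and the manual division is the handling of zero-length rows: `F.normalize` clamps the norm from below with its `eps` argument. The following sketch illustrates this on a small hand-picked tensor (the values are assumptions chosen for the example).

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[3.0, 4.0],
                  [0.0, 0.0]])  # second row has zero length

# Manual division: the zero row becomes 0 / 0, i.e. NaN.
manual = x / x.norm(p=2, dim=1, keepdim=True)

# F.normalize divides by max(norm, eps), so the zero row stays zero.
safe = F.normalize(x, p=2, dim=1)
```

For rows with a non-zero norm both variants give the same result.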

**sssohrab**(Sohrab Ferdowsi ) #11

What if the variable I am trying to normalize is in fact a `Parameter` from `nn.parameter` rather than a `Variable`?

In this case I get the following error with any of the above options:

**TypeError: cannot assign ‘torch.autograd.variable.Variable’ as parameter ‘W’ (torch.nn.Parameter or None expected)**

```
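import torch.nn as nn
import torch.nn.functional as F
from torch.nn import Parameter


class myUnit(nn.Module):
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter, requires_grad=True)

    def forward(self, input):
        """
        Whatever operation. Just an example:
        """
        # This assignment triggers the TypeError: F.normalize does not return a Parameter.
        self.myParameter = F.normalize(self.myParameter, p=2, dim=1)
        output = self.myParameter * input - 1
        return output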
```

Can I still keep using `Parameter` in my customized network and also be able to normalize it?
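A possible workaround, offered as a sketch and not as a confirmed answer from the thread: avoid re-assigning the attribute. Either normalize into a local tensor inside `forward`, or overwrite the stored values in place under `torch.no_grad()`. The class name `MyUnit`, the helper `renormalize_`, and the tensor shapes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyUnit(nn.Module):
    def __init__(self, init_weight):
        super().__init__()
        self.myParameter = nn.Parameter(init_weight.clone())

    def forward(self, input):
        # Normalize into a local tensor; the attribute itself stays a Parameter
        # and gradients flow back to it through F.normalize.
        w = F.normalize(self.myParameter, p=2, dim=1)
        return w * input - 1

    @torch.no_grad()
    def renormalize_(self):
        # Optional: overwrite the stored values in place, e.g. after optimizer.step().
        self.myParameter.copy_(F.normalize(self.myParameter, p=2, dim=1))


torch.manual_seed(0)
unit = MyUnit(torch.randn(4, 16))
out = unit(torch.randn(4, 16))
out.sum().backward()   # gradients reach unit.myParameter
unit.renormalize_()    # stored rows now have unit L2 norm
```

The first variant optimizes an unconstrained parameter and only uses its direction; the second keeps the stored parameter itself on the unit sphere between steps.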