Hi, I am using a network to embed some entities into a vector space. Because the length of the vectors decreases during training, I want to normalize their length to 1 at the end of each step. Is there any tool that I can use to normalize the embedding vectors?

I think the best thing you can do is to save the embedded indices and normalize their rows manually after the update (just `index_select` them, compute the row-wise norm, divide, and `index_copy` back into the weights). We only support automatic max norm clipping.
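For what it's worth, here is a minimal sketch of that manual approach. The embedding size, the indices, the placeholder loss and the SGD optimizer are all assumptions made up for illustration; only the `index_select` / norm / divide / `index_copy_` sequence comes from the suggestion above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and optimizer, chosen only for illustration.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

idx = torch.tensor([1, 3, 3, 7])      # indices embedded in this step
loss = emb(idx).pow(2).sum()          # placeholder loss
opt.zero_grad()
loss.backward()
opt.step()

# Renormalize only the rows that were used, outside of autograd.
with torch.no_grad():
    used = idx.unique()                                  # drop duplicate indices
    rows = emb.weight.index_select(0, used)              # gather the updated rows
    rows = rows / rows.norm(p=2, dim=1, keepdim=True)    # row-wise L2 normalization
    emb.weight.index_copy_(0, used, rows)                # write them back in place

row_norms = emb.weight[used].norm(p=2, dim=1)            # each entry is ~1.0
```

Doing the write-back under `torch.no_grad()` keeps the renormalization out of the graph, so it behaves like a projection step after the optimizer update rather than part of the model.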

If you want to normalize a vector as a part of a model, this should do it:

Assume `q` is the tensor to be L2-normalized along dim 1:

qn = torch.norm(q, p=2, dim=1).detach()

q = q.div(qn.expand_as(q))

Note the `detach()`, which is essential for the gradients to work correctly. I’m assuming you want the norm to be treated as a constant while dividing the tensor by it.
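To make the effect of `detach()` concrete, here is a small sketch comparing the gradient with and without it. The input `[3, 4]` is an assumption picked so that the norm is exactly 5, and `keepdim=True` is used so the division broadcasts.

```python
import torch

q = torch.tensor([[3.0, 4.0]], requires_grad=True)

# Variant A: the norm is detached, so backward treats it as a constant.
qn = q.norm(p=2, dim=1, keepdim=True).detach()
(q / qn)[0, 0].backward()
grad_detached = q.grad.clone()     # approximately [[0.2, 0.0]]

q.grad = None                      # reset before the second backward pass

# Variant B: gradients also flow through the norm.
qn = q.norm(p=2, dim=1, keepdim=True)
(q / qn)[0, 0].backward()
grad_full = q.grad.clone()         # approximately [[0.128, -0.096]]

# Without detach, the gradient is orthogonal to q, because rescaling q
# does not change q / ||q||.
radial_component = (grad_full * q.detach()).sum()   # approximately 0
```

So the two variants really do optimize different things. Which one is appropriate depends on whether the norm should count as part of the function being differentiated.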

Can I use it to normalise the embedding after each update during training?

Yes, it could be used to normalize an embedding. I suggest not using the `detach()` though. I found that it degrades performance.

I see, thanks. But I need to constrain the norms of the embeddings (no bigger than 1). Do you have any better suggestions for doing this?
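One possible way to enforce "norm at most 1" is to rescale only the rows that exceed the limit after each optimizer step. The sketch below is an assumption about what is wanted here, not an official recipe; the helper name `clip_rows_to_max_norm_` and the example weights are made up. `nn.Embedding` also accepts a `max_norm` argument, which is the automatic max norm clipping mentioned earlier in the thread.

```python
import torch
import torch.nn as nn

def clip_rows_to_max_norm_(weight, max_norm=1.0):
    """Hypothetical helper: shrink rows whose L2 norm exceeds max_norm, in place."""
    with torch.no_grad():
        norms = weight.norm(p=2, dim=1, keepdim=True)
        # scale < 1 only for rows that are too long; the clamp on norms avoids 0-division
        scale = (max_norm / norms.clamp(min=1e-12)).clamp(max=1.0)
        weight.mul_(scale)
    return weight

emb = nn.Embedding(3, 2)
with torch.no_grad():
    # Example weights: one row that is too long, one short row, one zero row.
    emb.weight.copy_(torch.tensor([[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]))

clip_rows_to_max_norm_(emb.weight)     # call this after optimizer.step()
clipped = emb.weight.detach().clone()  # rows: [0.6, 0.8], [0.3, 0.4], [0.0, 0.0]
```

Unlike full normalization, rows that are already inside the unit ball are left untouched.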

Why do we have to use `detach()`? Also, in the new PyTorch version, you have to use `keepdim=True` in the `norm()` method. A simple implementation of L2 normalization:

```
import torch
from torch.autograd import Variable

# Suppose x is a Variable of size [4, 16]: 4 is the batch size, 16 is the feature dimension.
x = Variable(torch.rand(4, 16), requires_grad=True)
norm = x.norm(p=2, dim=1, keepdim=True)  # shape [4, 1]
x_normalized = x.div(norm.expand_as(x))
```

If you use `keepdim=True`, you don’t even need `expand_as(x)`. The following works for me:

```
norm = x.norm(p=2, dim=1, keepdim=True)
x_normalized = x.div(norm)
```

PyTorch now has a `normalize` function, so it is easy to do L2 normalization for features. Suppose `x` is a feature tensor of size `N*D` (`N` is the batch size and `D` is the feature dimension); we can simply use the following:

```
import torch.nn.functional as F
x = F.normalize(x, p=2, dim=1)
```
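As a quick sanity check, the sketch below compares `F.normalize` with the manual division from earlier in the thread. The tensor shape is just the `[4, 16]` example reused, and the all-zero row is an added assumption to show the one place where the two differ.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(4, 16)   # N=4, D=16

manual = x / x.norm(p=2, dim=1, keepdim=True)
builtin = F.normalize(x, p=2, dim=1)
same = torch.allclose(manual, builtin)

# F.normalize divides by max(norm, eps), so an all-zero row stays zero,
# whereas the manual division would produce NaN (0 / 0).
zero_row = torch.zeros(1, 16)
safe = F.normalize(zero_row, p=2, dim=1)
unsafe = zero_row / zero_row.norm(p=2, dim=1, keepdim=True)
```

For non-degenerate rows the two give the same result, so `F.normalize` is mostly a convenience plus the `eps` guard.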

Yes, it failed quickly with `detach()`

What if the variable I am trying to normalize is in fact a `Parameter` from `nn.parameter` rather than a `Variable`?

Then in this case I get the following error for any of the above options I try:

**TypeError: cannot assign ‘torch.autograd.variable.Variable’ as parameter ‘W’ (torch.nn.Parameter or None expected)**

```
class myUnit(nn.Module):
    def __init__(self, myParameter):
        super(myUnit, self).__init__()
        self.myParameter = Parameter(myParameter, requires_grad=True)

    def forward(self, input):
        """
        Whatever operation. Just an example:
        """
        self.myParameter = F.normalize(self.myParameter, p=2, dim=1)
        output = self.myParameter * input - 1
        return output
```

Can I still keep using `Parameter` in my customized network and also be able to normalize it?

Maybe you can try:

```
# myParameter is itself a Parameter, so it has no .weight attribute; update its data directly.
self.myParameter.data = F.normalize(self.myParameter.data, p=2, dim=1)
```

@sssohrab I’m facing the exact same issue. Could you please let me know how exactly you got past this? Also, did you want the gradients to pass through the norm step?
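In case it helps, here is a sketch of two ways this could be handled, depending on the answer to the gradient question. It reuses the `myUnit` / `myParameter` names from the post; the `renormalize_` method, the shapes and the SGD optimizer are assumptions added for illustration. The `TypeError` comes from assigning a plain tensor back to `self.myParameter`, so both options avoid that reassignment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class myUnit(nn.Module):
    def __init__(self, myParameter):
        super().__init__()
        self.myParameter = nn.Parameter(myParameter.clone())

    def forward(self, input):
        # Option 1: normalize into a local tensor. The Parameter is never
        # reassigned, and gradients flow through the normalization.
        w = F.normalize(self.myParameter, p=2, dim=1)
        return w * input - 1

    @torch.no_grad()
    def renormalize_(self):
        # Option 2 (hypothetical helper): overwrite the stored values in place,
        # outside autograd, e.g. after each optimizer step.
        self.myParameter.copy_(F.normalize(self.myParameter, p=2, dim=1))

torch.manual_seed(0)
unit = myUnit(torch.rand(4, 16))
opt = torch.optim.SGD(unit.parameters(), lr=0.1)

out = unit(torch.rand(4, 16))
opt.zero_grad()
out.sum().backward()
opt.step()
unit.renormalize_()   # stored rows now have unit L2 norm
```

Option 1 alone is enough if only the normalized value is used in the forward pass. Option 2 additionally keeps the stored parameter itself on the unit sphere.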