Implementing trainable parameters with norm=1

I am trying to implement trainable parameters whose L2 norm is always 1.
Basically, I want to optimize the following variable p directly:

import torch
from torch import optim

# Variable is deprecated; tensors take requires_grad directly now
p = torch.zeros(15, requires_grad=True)
optimizer = optim.SGD([p], lr=0.1)

Every training step, p will be updated, but to prevent the values of p from exploding, I want to normalize p every step to have L2 norm = 1 (while keeping it trainable through the optimizer). Is this possible?

Thank you all!


You have two options here:

  • Projected gradient descent: after each optimizer step, project p back onto the space you want (here the unit sphere, so p = p / p.norm(); note that p.norm(1) would be the L1 norm, whereas p.norm() defaults to L2).
  • Reparameterization: learn an unconstrained p just like you do right now, but whenever you use it, use actual_p = p / p.norm(). This way, you make sure that actual_p always has unit L2 norm, whatever the value of p is.
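A minimal sketch of both options, assuming a toy quadratic loss toward a fixed random target (the loss and the 100-step loop are just illustrative choices, not from the original question):

```python
import torch
from torch import optim

torch.manual_seed(0)
target = torch.randn(15)  # hypothetical optimization target

# Option 1: projected gradient descent.
# Start from a nonzero init (projecting a zero vector would divide by zero).
p = torch.randn(15, requires_grad=True)
opt_p = optim.SGD([p], lr=0.1)
for _ in range(100):
    opt_p.zero_grad()
    loss = ((p - target) ** 2).sum()
    loss.backward()
    opt_p.step()
    with torch.no_grad():
        p /= p.norm()  # project back onto the unit sphere (L2 norm = 1)

# Option 2: reparameterization — optimize an unconstrained vector q
# and normalize it wherever it is used.
q = torch.randn(15, requires_grad=True)
opt_q = optim.SGD([q], lr=0.1)
for _ in range(100):
    opt_q.zero_grad()
    actual_q = q / q.norm()  # always has unit L2 norm; gradients flow through
    loss = ((actual_q - target) ** 2).sum()
    loss.backward()
    opt_q.step()
```

With option 1, p itself always ends a step with unit norm; with option 2, q is unconstrained and only actual_q is normalized, which lets the optimizer move freely in the unconstrained space.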