Implementing trainable parameters with norm=1

I am trying to implement trainable parameters whose norm is always 1.
Basically, I want to optimize the following variable p directly:

import torch
from torch import optim

p = torch.zeros(15, requires_grad=True)
optimizer = optim.SGD([p], lr=0.1)

Every training step, p will be updated, but to prevent the values of p from exploding, I want to normalize p after every step so that its L2 norm is 1 (while still keeping it trainable through the optimizer). Is this possible?

Thank you all!

Hi,

You have two options here:

  • Projected gradient descent, where after each step you project p back onto the set you want (here p = p / p.norm(), i.e. unit L2 norm); see the first sketch below.
  • Use the unconstrained version: learn p just like you do right now, but whenever you use it, use actual_p = p / p.norm(). This way, actual_p always has the right norm, whatever the value of p is; see the second sketch below.
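
A minimal sketch of the first option (projected gradient descent), assuming a toy quadratic loss against a random target purely for illustration. After each optimizer step, p is renormalized in place under torch.no_grad() so the projection itself is not tracked by autograd. Note that an all-zeros initialization cannot be projected (its norm is 0), so the sketch starts from a random vector:

import torch
from torch import optim

p = torch.randn(15, requires_grad=True)   # random init: a zero vector has norm 0
optimizer = optim.SGD([p], lr=0.1)

target = torch.randn(15)                  # placeholder objective, for illustration only
for step in range(100):
    optimizer.zero_grad()
    loss = ((p - target) ** 2).sum()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                 # the projection must not be recorded by autograd
        p /= p.norm()                     # L2 norm; p is now back on the unit sphere

And a sketch of the second option (reparameterization), under the same toy loss: p itself stays unconstrained, and the normalized actual_p is used in the forward pass, so the gradient flows through the normalization:

import torch
from torch import optim

p = torch.randn(15, requires_grad=True)
optimizer = optim.SGD([p], lr=0.1)

target = torch.randn(15)                  # placeholder objective, for illustration only
for step in range(100):
    optimizer.zero_grad()
    actual_p = p / p.norm()               # unit-norm view of p, part of the graph
    loss = ((actual_p - target) ** 2).sum()
    loss.backward()
    optimizer.step()

The first option keeps p itself on the constraint set, while the second only guarantees the norm of actual_p; which one behaves better depends on your loss, so both are worth trying.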