# How to do projected gradient descent?

Hi, I want to do a constrained optimization with PyTorch: find the minimum of a function $f(x_1, x_2, \dots, x_n)$ subject to $\sum_{i=1}^n x_i = 5$ and $x_i \geq 0$. I think this could be done via Softmax, so following the thread "How to do constrained optimization in PyTorch", I wrote:

import torch
from torch import nn

x = torch.rand(2, requires_grad=True)
lin = nn.Linear(2, 1)
optimizer = torch.optim.SGD([x], lr=0.1)

for i in range(100):
    optimizer.zero_grad()
    y = lin(x)
    y.backward()
    optimizer.step()
    # map x back onto the constraint set
    with torch.no_grad():
        x.copy_(nn.Softmax(dim=-1)(x) * 5)


If I print(y) in each step, the output is:

tensor([-0.2826], grad_fn=<AddBackward0>)
...


The parameters of the linear layer do not change between steps:

for i in lin.parameters():
    print(i)


The weights of the linear layer are:

Parameter containing:
tensor([[-0.2240, -0.1983]], requires_grad=True)
Parameter containing:
tensor([0.0823], requires_grad=True)


So the fitted function is $y = -0.224x_1 - 0.1983x_2 + 0.0823$; substituting $x_2 = 5 - x_1$ gives $y = -0.0257x_1 - 0.9092$, whose minimum (at $x_1 = 5$) should be $-1.0377$. Is the above code wrong? How should I modify it? Thank you very much.

Hi,

Is there a reason you use softmax instead of clamp? Softmax is not doing the projection of PGD, right?
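For concreteness, a minimal sketch of what a clamp-based projection step might look like (the `(x ** 2).sum()` objective is just a stand-in, and clamp by itself only enforces the $x_i \geq 0$ half of the constraint):

```python
import torch

# optimize x directly, projecting after each step
x = torch.rand(2, requires_grad=True)
opt = torch.optim.SGD([x], lr=0.1)

for _ in range(10):
    opt.zero_grad()
    loss = (x ** 2).sum()  # stand-in objective
    loss.backward()
    opt.step()
    # PGD projection step: clamp enforces x_i >= 0
    with torch.no_grad():
        x.clamp_(min=0.0)
```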

Hiiiii Sakuraiiiii!

The short answer is that Softmax isn't the right way to enforce your
constraint.

The medium-short answer is that you're getting almost the right
answer, but for the wrong reason.

The somewhat longer short answer is that Softmax isn't the right
way to enforce your constraint, but in this particular problem it
happens to land you very close to the right answer anyway.

Some details:

Because you're minimizing a linear function subject to linear
constraints, the minimum will occur where the constraints are
saturated.

Specifically, as you have worked out, the minimum occurs when
x_1 = 5.0 and x_2 = 0.0, and the minimum is -1.0377.

The first problem is that your 5.0 * Softmax constraint enforcement
doesn't map your solution back to itself:

5.0 * softmax ([5.0, 0.0]) = [4.9665, 0.0335]


and if you iterate 5.0 * softmax() several times, you get
[4.9641, 0.0359]. The value of your function for this value
of (x_1, x_2) is -1.03678, as returned by your training.
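You can check this fixed point directly by iterating the 5.0 * softmax() map (a quick sketch):

```python
import torch

# start from the true constrained minimum and iterate
x = torch.tensor([5.0, 0.0])
for _ in range(50):
    x = 5.0 * torch.softmax(x, dim=-1)

print(x)  # converges to roughly [4.9641, 0.0359]
```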

So the value you obtained is coming from your constraint
enforcement, not from the optimization. (For reasons I won't go
into, your constraint enforcement pretty much washes away the
changes in x_1 and x_2 made by your optimizer step.)

Why are you getting almost the right answer? Softmax pushes
one value of x_i (the largest) towards 1, and the others towards
0. This is where your constraints are saturated, and therefore where
the minimum lies. That is, in your two-variable case, 5 * Softmax
is pushing x_1 close to 5 (specifically to 4.9641), and x_2 close
to 0.

Even though 5 * Softmax does enforce your constraint (variables
are greater than 0 and sum to 5), it is, in general, not the right way
to enforce such constraints.

The problem (among other things) is that applying Softmax a second
time changes the x_i a second time. If your constraint is already
satisfied, reapplying the constraint shouldn't change things.

Related to this, if you're at the correct minimum ([5, 0]), applying
the constraint moves you away from the minimum (even though
the constraint was properly satisfied). This clearly isn't what you
want. (It's only by luck, because of the specific function you
were minimizing, that you got almost the right answer using Softmax.)

To enforce your constraint, I would recommend something like:

x_i = x_i + (5.0 - sum (x_i)) / n
x_i = max (0.0, x_i)
x_i = (5.0 / sum (x_i)) * x_i
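In PyTorch, a shift / clamp / rescale projection along these lines might look like the following sketch (`project` is just an illustrative name, and the final rescale assumes at least one coordinate is still positive after clamping):

```python
import torch

def project(x, total=5.0):
    # shift so the coordinates sum to `total`
    x = x + (total - x.sum()) / x.numel()
    # clamp negative coordinates to zero
    x = torch.clamp(x, min=0.0)
    # rescale so the sum is exactly `total` again
    return x * (total / x.sum())
```

After each optimizer step you would apply it under torch.no_grad(), e.g. x.copy_(project(x)), so the iterates stay feasible.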


(As an aside, you can also add terms to your loss function that push
the x_i in the direction of satisfying your constraints, and in the
case of your equality constraint, sum (x_i) = 5, you could add
a Lagrange-multiplier term to your loss function to enforce the
equality constraint.)
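As a rough illustration of the penalty idea (using the linear-layer weights quoted in this thread; the penalty weight of 10.0 is an arbitrary choice, and a plain quadratic penalty only enforces the constraints approximately, so the sum ends up close to, not exactly, 5):

```python
import torch
from torch import nn

lin = nn.Linear(2, 1)
with torch.no_grad():  # the weights from this thread
    lin.weight.copy_(torch.tensor([[-0.2240, -0.1983]]))
    lin.bias.copy_(torch.tensor([0.0823]))

x = torch.rand(2, requires_grad=True)
opt = torch.optim.SGD([x], lr=0.01)
penalty = 10.0  # arbitrary penalty weight

for _ in range(500):
    opt.zero_grad()
    loss = (lin(x)
            + penalty * (x.sum() - 5.0) ** 2
            + penalty * (torch.clamp(x, max=0.0) ** 2).sum())
    loss.backward()
    opt.step()

print(x)  # x_1 + x_2 ends up close to 5
```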

Good luck.

K. Frank


Thank you very much. That's very clear.