Computing constrained gradients

Here, computing gradients is an end in itself, not a means to the end of minimizing some function.

I have a vector x whose elements must always sum to 1. I want to know how much f(x) changes for valid infinitesimal changes in x, that is, how much f varies over small changes that keep the sum of x equal to 1.

Naively using .backward() and ordinary gradients would consider infinitesimal changes along every coordinate axis of the full space that x lives in. x + delta may then no longer be a valid vector, so this strategy cannot be used for my purpose.


How can I measure the gradient taken only in valid directions?

Hi Ryan!

You have a vector, x, that satisfies a constraint, g (x) = 0 (where,
in your case, g (x) = x.sum() - 1).

Your constraint defines a hypersurface in “x” space (in your specific
case, a hyperplane), and you only want to consider infinitesimal
changes to x that lie in this hypersurface.

The gradient of your constraint function, g, is perpendicular to your
constraint hypersurface, so you want to subtract off (“project away”)
any component of your gradient of f that is perpendicular to the
constraint surface (that is, any component that is parallel to the
normal (the vector perpendicular) to your constraint surface).
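In symbols, the projection described above can be sketched as follows. The unit normal n and the subscript "c" are notation introduced here for illustration only:

```latex
\[
  n = \frac{\nabla g}{\lVert \nabla g \rVert},
  \qquad
  \nabla_c f
    = \nabla f - (\nabla f \cdot n)\, n
    = \nabla f - \frac{\nabla f \cdot \nabla g}{\nabla g \cdot \nabla g}\, \nabla g ,
\]
\[
  \nabla_c f \cdot \nabla g
    = \nabla f \cdot \nabla g
      - \frac{\nabla f \cdot \nabla g}{\nabla g \cdot \nabla g}\,
        (\nabla g \cdot \nabla g)
    = 0 .
\]
```

For the sum-to-one constraint, the gradient of g is the all-ones vector, so the projection reduces to subtracting the mean of the components of the gradient of f from each component.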

So you could use pytorch’s autograd and backward() to separately
compute grad_f and grad_g. Then the constrained gradient you
want is:

grad_f_constrained = grad_f - ((grad_f * grad_g).sum() / (grad_g**2).sum()) * grad_g
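Here is a minimal sketch of that projection in PyTorch. It assumes x is a 1-D tensor; the helper name constrained_gradient and the example f are made up for illustration, and it uses torch.autograd.grad rather than backward() so that the two gradients do not accumulate into the same .grad field.

```python
import torch

def constrained_gradient(f, g, x):
    """Project grad f onto the tangent space of the surface g(x) = 0.

    Hypothetical helper: f and g map a 1-D tensor to a scalar tensor.
    """
    # Work on a fresh leaf tensor so the caller's x is left untouched.
    x = x.detach().clone().requires_grad_(True)
    (grad_f,) = torch.autograd.grad(f(x), x)
    (grad_g,) = torch.autograd.grad(g(x), x)
    # Coefficient of the component of grad_f along the normal grad_g.
    coeff = torch.dot(grad_f, grad_g) / torch.dot(grad_g, grad_g)
    return grad_f - coeff * grad_g

# Example function, chosen only for illustration.
f = lambda x: (x ** 2).sum()
# The constraint from the question: elements of x sum to 1.
g = lambda x: x.sum() - 1

x = torch.tensor([0.2, 0.3, 0.5])
grad_c = constrained_gradient(f, g, x)
print(grad_c)        # constrained gradient
print(grad_c.sum())  # approximately 0: no component along grad_g
```

Because grad_g is all ones for this constraint, the result should match grad_f - grad_f.mean(), which is a convenient sanity check.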

Good luck.

K. Frank


This makes sense, I’ll circle back if I have questions during the implementation. Thanks!