Here, computing gradients is an end in itself, not a means to the end of minimizing some function.

I have a vector `x` whose elements must always sum to 1. I want to know how much `f(x)` changes for valid infinitesimal changes in `x`, that is, how much `f` varies over small changes that keep `x` summing to 1.

Using `.backward()` and traditional gradients naively would consider infinitesimal changes along the coordinate axes of the full ambient space of `x`. Along such a direction, `x + delta` no longer sums to 1, so this strategy cannot be used for my purpose.

So:

How can I measure the gradient taken only in valid directions?

Hi Ryan!

You have a vector, `x`, that satisfies a constraint, `g (x) = 0` (where,
in your case, `g (x) = x.sum() - 1`).

Your constraint defines a hypersurface in "`x`" space (in your specific
case, a hyperplane), and you only want to consider infinitesimal
changes to `x` that lie in this hypersurface.

The gradient of your constraint function, `g`, is perpendicular to your
constraint hypersurface, so you want to subtract off ("project away")
any component of your gradient of `f` that is perpendicular to the
constraint surface (that is, any component parallel to the normal
vector of your constraint surface).
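In equation form, this is the standard projection that removes the component of the gradient of `f` along the gradient of `g`:

$$
\nabla_{\parallel} f \;=\; \nabla f \;-\; \frac{\nabla f \cdot \nabla g}{\lVert \nabla g \rVert^{2}}\,\nabla g
$$

(For `g(x) = x.sum() - 1`, `grad_g` is just a vector of ones, so this amounts to subtracting the mean of `grad_f` from each of its components.)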

So you could use PyTorch's autograd and `backward()` to separately
compute `grad_f` and `grad_g`. Then the constrained gradient you
want is:

```
grad_f_constrained = grad_f - (grad_f.dot (grad_g) / (grad_g**2).sum()) * grad_g
```
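Putting that together, here is a minimal sketch (with a toy `f`; substitute your own function) of how the two gradients could be computed with `torch.autograd.grad` and combined:

```python
import torch

# Hypothetical example function -- replace with your own f.
def f(x):
    return (x ** 3).sum()

# Constraint g(x) = 0 with g(x) = x.sum() - 1.
def g(x):
    return x.sum() - 1.0

x = torch.tensor([0.2, 0.3, 0.5], requires_grad=True)

# Gradient of f at x.
grad_f = torch.autograd.grad(f(x), x)[0]

# Gradient of g at x (for this constraint it is a vector of ones).
grad_g = torch.autograd.grad(g(x), x)[0]

# Project away the component of grad_f along grad_g, keeping only
# directions that preserve x.sum() == 1 to first order.
grad_f_constrained = grad_f - (grad_f.dot(grad_g) / (grad_g ** 2).sum()) * grad_g

print(grad_f_constrained)
print(grad_f_constrained.dot(grad_g))  # ~0: lies in the constraint hyperplane
```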

Good luck.

K. Frank


This makes sense, I'll circle back if I have questions during the implementation. Thanks!