For ad-hoc experimentation with backward not quite equal to forward, the pattern
```python
y = x_backward + (x_forward - x_backward).detach()
```
works quite well. It gets you `x_forward` in the forward pass, but the derivative will act as if you had `x_backward` (the detached difference cancels in the forward but contributes no gradient). Stay clear of NaN (and infinity - infinity, which is NaN). It’s a tad more expensive than a custom `autograd.Function`, probably.
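As an illustration (not from the original question, just a minimal sketch): a straight-through rounding op, where the forward uses the rounded value but the gradient flows as if we had used `x` directly.

```python
import torch

x = torch.randn(3, requires_grad=True)

x_forward = torch.round(x)   # value we want in the forward pass (gradient is zero almost everywhere)
x_backward = x               # value whose gradient we want in the backward pass

# forward: x_forward; backward: acts like x_backward
y = x_backward + (x_forward - x_backward).detach()

print(y)            # rounded values
y.sum().backward()
print(x.grad)       # all ones, as if y were x itself
```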
Best regards
Thomas