Apologies, y was used as a generic placeholder for an intermediate layer output (I probably should have used layer or some other variable name).
The extension to this question is applying different lambdas in the hook (for instance, to zero out some masked gradient values). I will try the method you suggested, thanks!
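For reference, here is a minimal sketch of what I mean by masking in the hook. The tensor values, the mask, and the names layer, mask, captured and masked_hook are all made up for illustration; the only PyTorch behaviour it relies on is that a tensor returned from a register_hook callback replaces the gradient flowing backward.

```python
import torch

# Toy setup (illustrative values): x is a leaf, layer is an intermediate output.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
layer = x * 2

# Hypothetical mask: zero out the gradient of the second element.
mask = torch.tensor([1.0, 0.0, 1.0])

captured = {}

def masked_hook(grad):
    # Save the unmodified gradient of the loss w.r.t. layer.
    captured["raw"] = grad.clone()
    # The returned tensor is used in place of grad for the rest of backward.
    return grad * mask

handle = layer.register_hook(masked_hook)

loss = (layer ** 2).sum()
loss.backward()
handle.remove()  # detach the hook once it is no longer needed

print(captured["raw"])  # tensor([ 4.,  8., 12.])
print(x.grad)           # tensor([ 8.,  0., 24.])
```

The masked entry ends up with zero gradient at x, while captured["raw"] still holds the original values.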
From what I understand, x.retain_grad() should be preferred over using the register_hook method. See this for a comparison of various approaches to saving gradients of intermediate variables:
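Separately from that comparison, a minimal sketch of the retain_grad() approach on a toy example (the values and the name layer are assumptions for illustration, not taken from the original code):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
layer = x * 2          # intermediate (non-leaf) tensor

# Ask autograd to keep .grad for this non-leaf tensor after backward().
layer.retain_grad()

loss = (layer ** 2).sum()
loss.backward()

print(layer.grad)  # tensor([ 4.,  8., 12.])
print(x.grad)      # tensor([ 8., 16., 24.])
```

Note that retain_grad() only stores the gradient; to change it during the backward pass (such as the masking case above), a hook is still needed.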