Is it possible (in PyTorch 1.x or 2.x) to modify what happens at the following points during training:
- During the forward pass, after the activations are stored (to be used later in the backward pass), to perform operations on them (for example, transferring them to another device)
- During the backward pass, before each layer's gradient computation, to perform operations on the activations (for example, transferring them back from another device)
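For the activation side of this, one mechanism that seems to cover both points is torch.autograd.graph.saved_tensors_hooks (available since PyTorch 1.10): the pack hook runs in the forward pass as each activation is saved for backward, and the unpack hook runs in the backward pass just before the gradient computation that needs that activation. A minimal sketch (offloading to CPU as the example operation; the model and shapes here are arbitrary placeholders):

```python
import torch
import torch.nn as nn

def pack(tensor):
    # Called during the forward pass when autograd saves an activation.
    # Remember the original device and offload the tensor to CPU.
    return (tensor.device, tensor.to("cpu"))

def unpack(packed):
    # Called during the backward pass, right before the gradient
    # computation that consumes this activation. Move it back.
    device, tensor = packed
    return tensor.to(device)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(2, 4, requires_grad=True)

with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    y = model(x).sum()
y.backward()  # unpack fires here as each saved activation is needed
```

There is also a ready-made context manager, torch.autograd.graph.save_on_cpu(), which does essentially this pack/unpack pair for you.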
Any information would be highly appreciated.
[EDIT: I’m exploring usage of functions like register_module_full_backward_pre_hook]
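For reference, the per-module variant of that API is register_full_backward_pre_hook (PyTorch 2.0+); the hook fires before gradients with respect to the module's inputs are computed, and receives the module and grad_output (not the saved activations). A minimal sketch of the shape I have in mind, recording which modules are hit so the firing order is visible (the Sequential model here is just a placeholder):

```python
import torch
import torch.nn as nn

seen = []

def backward_pre_hook(module, grad_output):
    # Runs during the backward pass, before this module's input
    # gradients are computed; modules are visited in reverse order.
    seen.append(module.__class__.__name__)
    return None  # returning None leaves grad_output unchanged

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
handles = [m.register_full_backward_pre_hook(backward_pre_hook)
           for m in model]

out = model(torch.randn(2, 4)).sum()
out.backward()

for h in handles:
    h.remove()  # detach the hooks when done
```

The global registration function mentioned above, torch.nn.modules.module.register_module_full_backward_pre_hook, installs the same kind of hook on every module at once.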