Modifying forward/backward pass

yoelshoshan · January 5, 2023, 6:40pm

Is it possible (in PyTorch 1.x or 2) to hook into training at the following two points and run custom operations there?

During the forward pass, after the activations are stored (to be used later in the backward pass), run operations on them (for example, transferring them to another device).
During the backward pass, before each layer's gradient calculation, run operations on those activations (for example, transferring them back to the original device).

Any information would be highly appreciated.

[EDIT: I’m exploring usage of functions like register_module_full_backward_pre_hook]
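For anyone following the same route, here is a minimal sketch of a per-layer backward pre-hook. It assumes a PyTorch version that provides `Module.register_full_backward_pre_hook` (2.0 or later, as far as I know); the toy model, the `order` list, and the `make_hook` helper are hypothetical names used only for illustration.

```python
import torch
import torch.nn as nn

order = []  # hypothetical log of which layer's hook fired, in firing order

def make_hook(name):
    # Build a hook that remembers which layer it was attached to.
    def hook(module, grad_output):
        # Runs before this module's gradients are computed.
        # Returning None leaves grad_output unchanged.
        order.append(name)
        return None
    return hook

# Toy model, assumed only for this example.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
for name, layer in model.named_children():
    layer.register_full_backward_pre_hook(make_hook(name))

x = torch.randn(2, 4)
model(x).sum().backward()
print(order)  # layers are visited from the output side back to the input side
```

Note that this hook receives the gradients flowing into the layer, not the saved activations themselves, so on its own it only marks the point in time before each layer's backward computation.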

ptrblck · January 6, 2023, 12:48am

Your use case sounds similar to CPU offloading, which uses torch.autograd.graph.saved_tensors_hooks or torch.autograd.graph.save_on_cpu, if I’m not mistaken. You could take a look at these context managers.
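As a rough sketch of how `saved_tensors_hooks` could be used for this: the pack hook runs during the forward pass whenever autograd saves a tensor for backward, and the unpack hook runs during the backward pass when that tensor is needed again. The toy model, the `events` list, and the hook names below are assumptions for illustration; the example runs on CPU, so the device moves are no-ops here.

```python
import torch
import torch.nn as nn
from torch.autograd.graph import saved_tensors_hooks

events = []  # hypothetical log showing when each hook fires

def pack_hook(tensor):
    # Forward pass: called for each tensor autograd saves for backward.
    # Remember the original device and keep a CPU copy instead.
    events.append("pack")
    return tensor.device, tensor.to("cpu")

def unpack_hook(packed):
    # Backward pass: called when the saved tensor is needed again.
    # Move it back to the device it came from.
    events.append("unpack")
    device, tensor = packed
    return tensor.to(device)

# Toy model, assumed only for this example.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(2, 4)

# Only tensors saved inside the context manager go through the hooks.
with saved_tensors_hooks(pack_hook, unpack_hook):
    loss = model(x).sum()
loss.backward()
```

If plain offloading to CPU is all that is needed, `torch.autograd.graph.save_on_cpu()` wraps the same pattern in a ready-made context manager, with an optional `pin_memory` argument.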

yoelshoshan · January 6, 2023, 7:15am

Thanks a lot!! Sounds like exactly what I need