To save GPU memory, is there a way to store all activations from a forward pass on the CPU and move them back to the GPU before the backward pass (instead of recomputing them)? What is the best way to go about this?