At the end of an intermediate layer in the forward pass, I want to store a modified version of that layer's output instead of the original one. Then, in the backward pass, I would like to use the modified version to compute the gradient for the next layer. Let's assume that the operation of the intermediate layer itself (ReLU, convolution, …) is unchanged.
My questions are:
- How do I ask PyTorch to save the modified tensor and detach the original one from the graph?
- If PyTorch allows this, how will GPU memory allocation be affected? Will it differ by the size of the difference between the original and modified tensors?
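For concreteness, here is a sketch of the kind of thing I have in mind, using `torch.autograd.graph.saved_tensors_hooks` (the `+ 0.1` modification is just a placeholder for whatever change I actually want to apply):

```python
import torch

def pack(t):
    # Replace the tensor that autograd wants to save with a modified
    # version. Only the returned tensor is kept alive for backward,
    # so the original can (in principle) be freed.
    return (t + 0.1).clone()  # placeholder modification

def unpack(t):
    # Hand the modified tensor back to the backward computation as-is.
    return t

x = torch.randn(4, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    y = torch.relu(x)  # the forward operation itself is unchanged

# The backward pass now uses the modified saved tensor.
y.sum().backward()
print(x.grad.shape)
```

Is this the right mechanism for what I describe, or would a custom `torch.autograd.Function` with `ctx.save_for_backward` be the better approach?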
Help is much appreciated!