I use a hook to store gradient information during testing, but this causes a CUDA out-of-memory error. How do I release the information stored by the hook?
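I assume storing the gradient information increases the memory usage, so you are running out of memory (OOM). You can delete tensors via del tensor once you no longer need them. The memory is only freed when no other reference to the tensor remains.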
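Here is a minimal sketch of what that could look like. It assumes a module backward hook registered with register_full_backward_hook; the names model, stored_grads, and save_grad are made up for illustration and are not from your code. Detaching the gradient and copying it to the CPU inside the hook keeps the stored copies off the GPU, and removing the hook handle stops further accumulation.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # hypothetical stand-in for your model
stored_grads = []         # hypothetical container filled by the hook

def save_grad(module, grad_input, grad_output):
    # Detach and copy to CPU so the stored copy does not occupy GPU memory.
    stored_grads.append(grad_output[0].detach().cpu())

# Keep the handle so the hook can be removed later.
handle = model.register_full_backward_hook(save_grad)

x = torch.randn(3, 4, requires_grad=True)
model(x).sum().backward()

num_stored = len(stored_grads)
stored_shape = tuple(stored_grads[0].shape)

# Release: stop the hook from firing, then drop the stored references.
handle.remove()
del stored_grads[:]

# Optional: return cached, unused blocks to the GPU driver.
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```

Note that after del, PyTorch's caching allocator may still hold the freed memory for reuse, so nvidia-smi can keep showing it as used until torch.cuda.empty_cache() is called.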