This thread explains, and helps sort out, situations where an exception occurs in a Jupyter notebook and the user can do nothing further without restarting the kernel and re-running the notebook from scratch. This usually happens after a CUDA Out of Memory exception, but it can happen with any exception.
The problem comes from ipython, which keeps the exception's traceback. The frames in that traceback still reference each function's `locals()`, which prevents general and GPU memory from being released.
Currently there are 2 solutions to this problem:
- stripping `locals()` from the traceback (tb) before the exception is passed to ipython (preemptive)
- raising a 2nd, simple local exception, like `1/0`, in the notebook, which resets the tb (reactive)
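To make the preemptive option concrete, here is a minimal sketch in plain Python. It is not the fastai implementation: `strip_tb_locals`, `Payload`, `failing_step` and `leaks` are hypothetical names made up for illustration, and a small object tracked through a weak reference stands in for a large GPU tensor. The only library call it relies on is the standard `traceback.clear_frames`.

```python
import functools
import gc
import traceback
import weakref


def strip_tb_locals(func):
    """Hypothetical decorator: clear frame locals before the exception escapes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            # Drops the locals of every finished frame in the traceback.
            # The wrapper's own frame is still executing, so it is skipped
            # and whatever was passed in args/kwargs stays referenced.
            traceback.clear_frames(e.__traceback__)
            raise
    return wrapper


class Payload:
    """Stand-in for a large tensor (an assumption for this demo)."""


def failing_step(refs):
    big = Payload()                 # local that we want released
    refs.append(weakref.ref(big))   # lets the caller check whether it survived
    raise RuntimeError("CUDA out of memory (simulated)")


guarded_step = strip_tb_locals(failing_step)


def leaks(step):
    """True if the step's local Payload is still alive while the exception is kept."""
    refs = []
    try:
        step(refs)
    except RuntimeError as exc:
        kept = exc  # mimics ipython holding on to the last exception
    gc.collect()
    return refs[0]() is not None
```

With this sketch, `leaks(failing_step)` reports that the payload is still alive as long as the exception is kept, while `leaks(guarded_step)` reports that it was released. The reactive option needs no helper at all: running a cell that contains only `1/0` replaces the stored traceback.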
There will be better solutions once ipython sorts this out. The difficulty is to continue supporting the `%debug` magic, which relies on the stored traceback. You can also follow the discussion here.
Please read the guide https://docs.fast.ai/troubleshoot.html#memory-leakage-on-exception which explains the problem in detail and provides concrete solutions. You can skip the fastai-specific section of the guide if it's not relevant to you and just read the introduction and the custom solutions sections. If you have any questions after reading the guide, or have difficulty applying the information, please ask in this dedicated thread.
I will update this post once we have a resolution from the ipython dev team (which could take a while).