As above. I killed my program, but when I ran it again I got an out-of-memory error (batch size is 64 for ResNet-50).
So what is wrong?
I have sometimes run into the same issue after killing a script that used multiple
DataLoader workers with CTRL+C.
It seems that killing the process this way does not always kill all of its child processes, so the leftover processes keep holding GPU memory.
If no other Python scripts are running on your machine, try killing the leftover processes with
killall python in your terminal.
This usually works for me.
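If other Python jobs are running and killall is too broad, it may be safer to first check which PIDs are actually holding GPU memory. Here is a rough sketch, assuming nvidia-smi is on your PATH; the helper names parse_compute_apps and list_gpu_processes are made up for this example and are not part of PyTorch.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows into (pid, name, memory) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip anything that does not look like a data row.
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        pid = int(parts[0])
        memory = parts[-1]              # kept as text; the driver may report "[N/A]"
        name = ",".join(parts[1:-1])    # tolerate commas inside the process name
        rows.append((pid, name, memory))
    return rows


def list_gpu_processes():
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


# Example usage (needs an NVIDIA driver, so it is left commented out):
# for pid, name, memory in list_gpu_processes():
#     print(pid, name, memory)
```

Once you know which PID is the stale one, you can kill just that process (for example with kill followed by the PID in your terminal) instead of every Python process.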
Yes, I just found many reports of the same problem. Unfortunately, there is another PID still running. I will look into that first; if there is no other approach, I will try your method. Thank you!