Torch shm manager blowing up my computer

I am training CNNs that are a few layers deep. My laptop has ten CPU cores (verified with os.cpu_count()), and I’m using ten workers for my DataLoader, which could be the issue.

I run my training through the command line with
python train.py

When I hit Ctrl+Z on the command line to halt a training run, there were approximately three thousand “torch shm manager” processes in Activity Monitor on my Mac, each consuming 0.5% CPU. Despite heavy lag, I managed to force quit all of them. When I use a keyboard interrupt (Ctrl+C) to exit the training, this issue doesn’t occur.

Last night and the night before, I wasn’t able to fight the lag: my M1 Pro Mac’s display froze and then the machine shut down. Is there a way for me to use Ctrl+Z and not blow up my computer?
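
For context, Ctrl+Z does not end the run. The terminal sends SIGTSTP, which suspends the foreground job, while Ctrl+C sends SIGINT and Python raises KeyboardInterrupt. One thing that might be worth trying is to make Ctrl+Z behave like Ctrl+C inside train.py. The snippet below is only a sketch under that assumption: the helper names are hypothetical, it is Unix-only, and it has not been verified to stop the “torch shm manager” pile-up.

```python
import signal


def install_sigtstp_as_interrupt():
    """Make Ctrl+Z (SIGTSTP) raise KeyboardInterrupt in the main process.

    Hypothetical helper, not part of PyTorch. Must be called from the
    main thread. Returns the previous handler so it can be restored.
    """
    def _handler(signum, frame):
        # Take the same exit path that Ctrl+C would take.
        raise KeyboardInterrupt("SIGTSTP received")

    return signal.signal(signal.SIGTSTP, _handler)


def ignore_sigtstp_in_worker(worker_id):
    """Hypothetical worker_init_fn: ignore Ctrl+Z inside a DataLoader worker.

    The terminal delivers SIGTSTP to every process in the foreground
    group, so workers would otherwise be suspended as well.
    """
    signal.signal(signal.SIGTSTP, signal.SIG_IGN)


# Possible usage in train.py (assumption, untested with real workers):
#
#   if __name__ == "__main__":
#       install_sigtstp_as_interrupt()
#       loader = DataLoader(dataset, num_workers=10,
#                           worker_init_fn=ignore_sigtstp_in_worker)
```

If the goal is only to end a job that is already suspended, the shell's `kill %1` (or `fg` followed by Ctrl+C) terminates it instead of leaving it stopped in the background.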

I have the same problem on an M3, running PyTorch code in VS Code. Did you find a solution?