DataLoader was killed

If I set num_workers to 8 (or 4), then after hundreds of iterations I get the error “RuntimeError: DataLoader worker (pid 56901) is killed by signal: Killed.”

I searched related topics, and some suggested setting “num_workers=0”. I tried that, but after hundreds of iterations the process was still killed. This time the only output was “Killed”, with no other hints.

The top output and the training output are as follows:

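Your data loading code is being killed by SIGKILL, a signal that the process cannot catch or handle. On Linux, a bare “Killed” with no Python traceback often means the kernel's out-of-memory killer stopped the process, so check memory usage first and see what could trigger such a signal.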
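One rough way to check is to log the process's peak memory every so often during training. The sketch below uses only the standard library and assumes a Unix-like system; `peak_rss_mb` and the `leak` list are hypothetical names for illustration, and the `bytearray` allocations only simulate a loop that keeps holding on to memory.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident memory of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


leak = []  # hypothetical stand-in for whatever the training loop accumulates
for step in range(5):
    leak.append(bytearray(10 * 1024 * 1024))  # simulate 10 MiB retained per step
    print(f"step {step}: peak RSS {peak_rss_mb():.1f} MiB")
```

If the logged value keeps climbing for as long as training runs instead of levelling off after the first iterations, something is being retained on every step.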

Solved. It seems I had a bunch of tensors that should have had requires_grad set to False, but I had set it to True.
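The post does not say how those tensors were used, so the following is only a minimal sketch of one common way this causes memory growth: anything computed from a tensor with requires_grad=True carries a reference to its autograd graph, and keeping such results across iterations keeps those graphs alive. The names `tracked`, `constant`, and `history` are made up for illustration.

```python
import torch

# Hypothetical stand-ins for tensors that are really constants (e.g. inputs or targets).
tracked = torch.ones(4, requires_grad=True)    # mistakenly tracked by autograd
constant = torch.ones(4, requires_grad=False)  # the default: not tracked

# A result computed from a tracked tensor has a grad_fn, i.e. a reference
# to the autograd graph that produced it.
from_tracked = (tracked * 2).sum()
from_constant = (constant * 2).sum()

print(from_tracked.requires_grad, from_tracked.grad_fn is not None)  # True True
print(from_constant.requires_grad, from_constant.grad_fn is None)    # False True

# If results like from_tracked are stored across iterations, each one keeps its
# graph in memory. Detaching before storing drops that reference.
history = [from_tracked.detach()]
print(history[0].requires_grad)  # False
```

Leaving requires_grad at False for tensors that never need gradients avoids building the graph in the first place; detach() (or .item() for scalars) helps when a value from a tracked computation has to be kept for logging.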

For reference, an update on the earlier error. I also got: “RuntimeError: Torch: not enough memory: you tried to allocate 1GB. Buy new RAM!”