While training the model, everything is fine for the first several epochs. But after running for several more epochs, the thread is killed. The error shows ‘Dataloader workers is killed by a signal: bus error’.
However, ‘nvidia-smi -l’ shows that there is definitely enough GPU memory.
Besides, I need to do a lot of data processing in the data loader, including opening several images and resizing them, or embedding some data. The error often appears right after validation finishes. I don’t know the real cause and would really appreciate your help.
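In case it helps with diagnosis: ‘nvidia-smi’ only reports GPU memory, while DataLoader workers pass batches to the main process through host shared memory, so that may be worth checking too. Below is a minimal sketch of such a check. It assumes a Linux machine where shared memory is mounted at `/dev/shm` (an assumption on my part), and `report_usage` is a hypothetical helper name, not part of PyTorch.

```python
import os
import shutil


def report_usage(path="/dev/shm"):
    """Return (total, used, free) in MiB for the filesystem holding `path`.

    `report_usage` is a hypothetical helper for illustration only.
    """
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return usage.total // mib, usage.used // mib, usage.free // mib


if __name__ == "__main__":
    # Assumption: shared memory lives at /dev/shm (typical on Linux).
    # Fall back to the current directory so the script still runs elsewhere.
    path = "/dev/shm" if os.path.isdir("/dev/shm") else "."
    total, used, free = report_usage(path)
    print(f"{path}: total={total} MiB, used={used} MiB, free={free} MiB")
```

Running this before training and again around the time validation finishes would show whether free shared memory is shrinking. If it is, trying a smaller `num_workers` in the `DataLoader` might be one way to see whether the error is related.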
Thanks in advance.