Hi guys, I'm getting this error: “ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)”
I'm running my code on my local machine, NOT in Docker.
GPU: GTX 1080 Ti with 12G memory
RAM: 32G
CPU: Intel 9700K
PyTorch version: 0.4.1 and 1.0.1 (both tested)
Python version: 3.5
I load my data with:
1) torch.utils.data.Dataset class
2) torch.utils.data.DataLoader class with num_workers=12
I have trained with this configuration before and never saw this error.
My dataset consists of many JPEG images, so Dataset.__getitem__() loads each image from file into a numpy.array using opencv-python and converts it to a Tensor.
With num_workers=0, the pipeline is very slow and only one CPU thread is used.
With num_workers=12, all CPU threads are used, which is good, but the pipeline is still slow for the first two epochs and fast afterwards, and my RAM is nearly full. I guess it loads the data into RAM first and then trains on it.
But now it throws many errors like the one in the title.
iter_t is the number of seconds per step. It is now very slow; usually it is less than 1.
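iter_t covers the whole step, so it does not show how much of the time is spent waiting for data. A small sketch of how the wait on the loader alone could be measured (timed_steps is a hypothetical helper, not part of PyTorch; it works on any iterable, including a DataLoader):

```python
import time


def timed_steps(iterable):
    """Yield (item, seconds), where seconds is the time spent waiting for item."""
    iterator = iter(iterable)
    while True:
        start = time.perf_counter()
        try:
            item = next(iterator)
        except StopIteration:
            return
        yield item, time.perf_counter() - start
```

In a training loop this would be used as `for batch, wait_s in timed_steps(loader): ...`. If wait_s is close to iter_t, the step time is dominated by data loading rather than by the GPU work.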
- How do I fix this error?
- How can I speed up the data loading pipeline? (I am already using the data.Dataset and data.DataLoader APIs with num_workers=12.)
My code is here:
Because I'm a new user, I can only upload one picture…