I am trying to train MobileNet on multiple GPUs using PyTorch. Running watch nvidia-smi, I see that the GPUs alternate between working and sitting idle (GPU utilization drops to 0%). This slows training down a lot.
However, training MobileNet on a single GPU, or training ResNet50 on multiple GPUs, does not show this issue. I am wondering what is going wrong. Has anyone run into this problem before?
PS:
- PyTorch version is 0.4.0.
- I read all training data into memory.
- I have also tried Keras, which does not have this issue.
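
To put a number on the idle periods instead of eyeballing nvidia-smi, something like the sketch below could be run alongside training. It is only a rough illustration: the helper names (`parse_utilization`, `idle_fraction`, `poll`) are made up for this post, and it assumes `nvidia-smi` is on the PATH and reports a plain integer for utilization (it would fail on a value such as `[N/A]`).

```python
import subprocess
import time
from collections import defaultdict

# Ask nvidia-smi for one "index, utilization" CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Parse lines like '0, 97' into {gpu_index: utilization_percent}."""
    result = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, util = [field.strip() for field in line.split(",")]
        result[int(index)] = int(util)
    return result


def idle_fraction(samples):
    """For each GPU, the fraction of samples that reported 0% utilization."""
    idle = defaultdict(int)
    for sample in samples:
        for gpu, util in sample.items():
            idle[gpu] += int(util == 0)
    return {gpu: count / len(samples) for gpu, count in idle.items()}


def poll(n_samples=50, interval=0.2):
    """Sample nvidia-smi n_samples times (hypothetical defaults) and summarize."""
    samples = []
    for _ in range(n_samples):
        output = subprocess.check_output(QUERY).decode()
        samples.append(parse_utilization(output))
        time.sleep(interval)
    return idle_fraction(samples)
```

Calling `poll()` from a second terminal while training runs would return one idle fraction per GPU, which makes it easier to compare the MobileNet multi-GPU case with the single-GPU and ResNet50 runs.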