Error: Read timeout and master GPU process crashed

I installed PyTorch from source:
python setup.py install
and when I use the gloo backend, I get this error:
[/home/zsk/code/pytorch_v0.4.1/third_party/gloo/gloo/transport/ibverbs/buffer.cc:108] Read timeout LID: 5 QPN: 27156 PSN: 14966066
and then the master GPU process crashes.
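
For anyone comparing logs across nodes, here is a small sketch that pulls the fields out of that message. The regular expression is only an assumption based on the one line quoted above, and `parse_read_timeout` is a hypothetical helper name, not part of gloo or PyTorch. LID, QPN and PSN are usually read as the InfiniBand local identifier, queue pair number and packet sequence number; I don't know whether the values printed here refer to the local or the remote side.

```python
import re

# Pattern inferred from the single log line quoted above (an assumption,
# not a documented gloo log format).
_READ_TIMEOUT = re.compile(
    r"\[(?P<source>[^\]]+):(?P<line>\d+)\]\s+Read timeout\s+"
    r"LID:\s*(?P<lid>\d+)\s+QPN:\s*(?P<qpn>\d+)\s+PSN:\s*(?P<psn>\d+)"
)


def parse_read_timeout(log_line):
    """Return the fields of a 'Read timeout' line as a dict, or None if no match."""
    match = _READ_TIMEOUT.search(log_line)
    if match is None:
        return None
    return {
        "source": match.group("source"),   # file that reported the timeout
        "line": int(match.group("line")),  # line number inside that file
        "lid": int(match.group("lid")),
        "qpn": int(match.group("qpn")),
        "psn": int(match.group("psn")),
    }
```

Running it on the output of every rank makes it easier to see whether all timeouts report the same LID/QPN values or different ones.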

However, when I install from the binaries and also use gloo, everything works.
I printed some debug info, and it shows the process blocking in broadcast (waitRecv) when PyTorch starts up.
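
To narrow down the blocking broadcast, a minimal reproduction along these lines might help. It is only a sketch under assumptions: it uses the `env://` rendezvous, so `MASTER_ADDR`, `MASTER_PORT`, `RANK` and `WORLD_SIZE` must be set on every node, and `read_rank_and_world_size` and `run_broadcast` are hypothetical names chosen for illustration, not the original training code.

```python
import os


def read_rank_and_world_size(environ):
    """Read RANK and WORLD_SIZE, defaulting to a single-process job."""
    rank = int(environ.get("RANK", "0"))
    world_size = int(environ.get("WORLD_SIZE", "1"))
    return rank, world_size


def run_broadcast(environ=os.environ):
    # torch is imported here so the helper above stays usable without it.
    import torch
    import torch.distributed as dist

    rank, world_size = read_rank_and_world_size(environ)
    # env:// rendezvous reads MASTER_ADDR and MASTER_PORT from the environment.
    dist.init_process_group(
        backend="gloo", init_method="env://", world_size=world_size, rank=rank
    )

    tensor = torch.ones(4) * rank  # rank 0 holds zeros, every other rank differs
    print("rank %d before broadcast: %s" % (rank, tensor))
    dist.broadcast(tensor, src=0)  # the call reported to block in waitRecv
    print("rank %d after broadcast: %s" % (rank, tensor))


# Only start the job when the rendezvous address has been configured.
if __name__ == "__main__" and "MASTER_ADDR" in os.environ:
    run_broadcast()
```

If the second print never appears on the source build but does appear on the binary install, that would confirm the hang is in the broadcast itself rather than in the model code.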

Today I reinstalled with the command:
USE_GLOO_IBVERBS=0 python setup.py build develop
This works, although I know a build made this way only supports TCP, not IB.

So why do I get the error when I use IB mode?

Does anyone know? …

I got similar problems. Have you solved this? Thanks.