When running faster_rcnn, I hit this error. I hope someone can help me!
It happened when I had only run epoch 0.
Could you rerun the script with:
NCCL_DEBUG=INFO TORCH_DISTRIBUTED_DEBUG=DETAIL python script.py args
and post the logs here, please?
Can I chat with you privately?
script.py and args are only placeholders; you should replace them with your actual script file name and its arguments.
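If prefixing the command in the shell is awkward (for example, when the job is started from another Python program), the same two environment variables can be set from a small launcher. This is only a minimal sketch: build_debug_env and run_with_debug are hypothetical helper names, not part of PyTorch or NCCL, and the script name in the usage comment is an assumed example.

```python
import os
import subprocess
import sys


def build_debug_env(base_env=None):
    """Return a copy of the environment with the two debug variables set.

    The variable names and values are the ones from the command above;
    the caller's environment is copied, not modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_DEBUG"] = "INFO"
    env["TORCH_DISTRIBUTED_DEBUG"] = "DETAIL"
    return env


def run_with_debug(script, args):
    """Run `python <script> <args...>` with the debug variables set.

    Output is captured so the logs can be saved and posted afterwards.
    """
    cmd = [sys.executable, script, *args]
    return subprocess.run(cmd, env=build_debug_env(), capture_output=True, text=True)


# Hypothetical usage; "train.py" and its argument stand in for your own script:
# result = run_with_debug("train.py", ["--epochs", "1"])
# print(result.stdout)
# print(result.stderr)
```

Capturing stdout and stderr separately makes it easy to paste the complete logs into the thread; the launcher does not change how the training script itself behaves.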
Thanks, I've solved it.
How did you resolve the issue and what was the root cause?