Hi, experts
We found 2 strange things when using torch distributed:
- 1. An OP always runs in a separate stream (not the current stream), but the code shows that it chooses the default stream when `async_op` is false.
- 2. Sometimes we found that 2 OPs can even run in 2 separate streams (neither of them the current stream).
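For reference, here is a minimal sketch of the kind of call we mean. It is not our actual code: the helper names `run_all_reduce` and `_free_port` are made up for illustration, and it assumes a single-process group (NCCL if a GPU is available, otherwise Gloo) so it can run on one machine. It only prints the current stream before the collective; seeing which stream the kernel really lands on still needs a profiler trace.

```python
import os
import socket

import torch
import torch.distributed as dist


def _free_port() -> int:
    """Ask the OS for an unused TCP port (illustrative helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def run_all_reduce(async_op: bool) -> torch.Tensor:
    """Run one all_reduce in a single-process group and return the result on CPU."""
    use_cuda = torch.cuda.is_available() and dist.is_nccl_available()

    if not dist.is_initialized():
        # Assumption: rank 0 of a world of size 1, rendezvous over localhost.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", str(_free_port()))
        dist.init_process_group("nccl" if use_cuda else "gloo", rank=0, world_size=1)

    device = torch.device("cuda", 0) if use_cuda else torch.device("cpu")
    x = torch.ones(4, device=device)

    if use_cuda:
        # The stream we expect the op to use when async_op is False.
        print("current stream:", torch.cuda.current_stream())

    # With async_op=True a work handle is returned; with False the result is None.
    work = dist.all_reduce(x, async_op=async_op)
    if work is not None:
        work.wait()

    return x.cpu()


if __name__ == "__main__":
    print(run_all_reduce(async_op=False))
    print(run_all_reduce(async_op=True))
    dist.destroy_process_group()
```

Running this under a profiler for both values of `async_op` is how the two observations above can be compared.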
Really appreciate your help!