I am using the NCCL backend to send and receive PyTorch tensors on my server.
I am trying to use the following commands to check the bandwidth usage. Is this correct?
First I use
'tcp://127.0.0.1:1224'
as the init method for torch.distributed.
Then I set
export NCCL_SOCKET_IFNAME=lo
lo is the loopback network interface.
Then I use
sudo iftop -i lo -n -P
But the bandwidth it reports is much lower than I expect.
What should I do to check the bandwidth usage of dist.send and dist.recv?
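
One possible cross-check, independent of iftop, is to time the transfers inside the script and divide the bytes moved by the elapsed time. The sketch below assumes a single machine with at least two GPUs and reuses the tcp://127.0.0.1:1224 address from above. The names bandwidth_gb_per_s, worker, and main, as well as the tensor size and iteration count, are made up for illustration. Note that when both ranks are on the same machine, NCCL typically moves tensor data directly between GPUs (peer-to-peer or shared memory) rather than through a socket, which may be why little traffic shows up on lo.

```python
import time


def bandwidth_gb_per_s(num_bytes, seconds):
    """Convert bytes moved and elapsed seconds to GB/s (1 GB = 1e9 bytes)."""
    return num_bytes / seconds / 1e9


def worker(rank, world_size=2, numel=64 * 1024 * 1024, iters=20):
    # torch is imported here so the helper above can be used without it.
    import torch
    import torch.distributed as dist

    torch.cuda.set_device(rank)  # assumption: one GPU per rank
    dist.init_process_group(
        backend="nccl",
        init_method="tcp://127.0.0.1:1224",
        rank=rank,
        world_size=world_size,
    )
    tensor = torch.ones(numel, dtype=torch.float32, device="cuda")

    def transfer():
        # Rank 0 sends, rank 1 receives.
        if rank == 0:
            dist.send(tensor, dst=1)
        else:
            dist.recv(tensor, src=0)

    for _ in range(3):  # warm-up so setup cost is not timed
        transfer()
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        transfer()
    torch.cuda.synchronize()  # wait for the GPU work to finish
    elapsed = time.perf_counter() - start

    total_bytes = tensor.numel() * tensor.element_size() * iters
    print(f"rank {rank}: {bandwidth_gb_per_s(total_bytes, elapsed):.2f} GB/s")
    dist.destroy_process_group()


def main():
    # Spawns two processes; each one calls worker(rank).
    import torch.multiprocessing as mp

    mp.spawn(worker, nprocs=2)
```

To try it, call main() under an `if __name__ == "__main__":` guard on a machine with two GPUs. The printed figure is an end-to-end estimate from the script's point of view, so it will not necessarily match what a network monitor reports.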