Are you sure the second GPU is actually available? What does torch.cuda.device_count() return?
You might also be hiding the second GPU with CUDA_VISIBLE_DEVICES=0.
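A quick way to check both things at once is a small diagnostic like the one below. This is only a sketch: `parse_visible_devices` and `report_gpus` are hypothetical helper names made up for this reply, and the only PyTorch calls used are `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.get_device_name()`.

```python
import os


def parse_visible_devices(value):
    """Split a CUDA_VISIBLE_DEVICES value into its entries.

    Returns None when the variable is unset (no GPU is hidden), and an
    empty list when it is set to an empty string (every GPU is hidden).
    """
    if value is None:
        return None
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def report_gpus():
    """Print what PyTorch can see in the current process."""
    # Imported here so the parsing helper above can be used without PyTorch.
    import torch

    visible = parse_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES"))
    if visible is None:
        print("CUDA_VISIBLE_DEVICES is unset, so no GPU is hidden by it")
    else:
        print("CUDA_VISIBLE_DEVICES restricts the process to:", visible)

    print("torch.cuda.is_available():", torch.cuda.is_available())
    count = torch.cuda.device_count()
    print("torch.cuda.device_count():", count)
    for index in range(count):
        print(f"  cuda:{index} ->", torch.cuda.get_device_name(index))
```

Call `report_gpus()` from the same environment (same shell, same launcher script) that runs your training code. If the count is 1 and the variable is set to `0`, the second GPU is simply hidden from the process.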
Could you tell me what happens when tensor.to is called? Is the tensor transferred directly from GPU 1 to GPU 2, or is it first transferred from GPU 1 to CPU memory and then from CPU memory to GPU 2?
It depends on the hardware you have.
If possible, it will be sent directly from one GPU to the other. But AFAIK not all cards support that, in which case the copy has to go through the CPU.
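If you want to find out which case applies to your machine, you can ask the driver whether one GPU can access the other's memory. A rough sketch, assuming at least two visible GPUs: `describe_copy_path` and `check_peer_access` are hypothetical names made up for this reply, while `torch.cuda.can_device_access_peer` is the actual PyTorch call. Treat the result as a hint about what the hardware allows, not as proof of the path a particular copy takes.

```python
def describe_copy_path(src, dst, peer_access):
    """Turn a peer-access flag into a human-readable description."""
    if src == dst:
        return "same device, no transfer between GPUs"
    if peer_access:
        return "direct GPU-to-GPU copy is possible"
    return "no peer access, the copy likely goes through CPU memory"


def check_peer_access(src=0, dst=1):
    """Report whether GPU `src` can access the memory of GPU `dst`."""
    # Imported here so describe_copy_path can be used without PyTorch.
    import torch

    needed = max(src, dst) + 1
    found = torch.cuda.device_count()
    if found < needed:
        raise RuntimeError(f"need at least {needed} visible GPUs, found {found}")

    peer_access = torch.cuda.can_device_access_peer(src, dst)
    return describe_copy_path(src, dst, peer_access)
```

Running `print(check_peer_access(0, 1))` on the machine in question tells you which of the two situations you are in. In both cases the code that moves the tensor is the same, e.g. `tensor.to("cuda:1")`.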