That sounds surprising. Could you share some more details about the setup, e.g., are cuda:0 and cuda:1 identical devices? What happens if cuda:1 is used as cuda:0, e.g., with CUDA_VISIBLE_DEVICES=1?
Interestingly, the two ways of selecting the CUDA device lead to different results:
Method 1: set CUDA_VISIBLE_DEVICES=1 before running the code x=torch.LongTensor([1,2,3]) ...
Method 2: do not set the environment variable and run x=x.to("cuda:1")
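To make the difference between the two methods concrete, here is a minimal sketch of how the device index seen by the code maps to a physical GPU. The helper name logical_to_physical is hypothetical (it is not part of PyTorch), and it assumes CUDA_VISIBLE_DEVICES holds a comma-separated list of integer ids rather than GPU UUIDs. The guarded part at the bottom only runs if torch is installed and CUDA is available.

```python
import os


def logical_to_physical(logical_index, visible_devices):
    """Map the N in "cuda:N" to a physical GPU id.

    Hypothetical helper for illustration only.
    visible_devices is the value of CUDA_VISIBLE_DEVICES, or None if unset.
    Assumes integer ids (no GPU UUIDs).
    """
    if visible_devices is None:
        # Variable unset: every GPU is visible and indices are unchanged.
        return logical_index
    ids = [part.strip() for part in visible_devices.split(",") if part.strip()]
    # Visible GPUs are renumbered from 0 in the order they are listed.
    return int(ids[logical_index])


# Method 1: CUDA_VISIBLE_DEVICES=1, the code sees a single GPU as cuda:0.
print(logical_to_physical(0, "1"))   # physical GPU 1
# Method 2: variable unset, the code addresses cuda:1 directly.
print(logical_to_physical(1, None))  # physical GPU 1

if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        env = os.environ.get("CUDA_VISIBLE_DEVICES")
        # Under method 1 only index 0 exists; under method 2 use index 1
        # when a second GPU is present.
        index = 1 if env is None and torch.cuda.device_count() > 1 else 0
        x = torch.LongTensor([1, 2, 3]).to(f"cuda:{index}")
        print(x.device, "-> physical GPU", logical_to_physical(index, env))
```

Both methods end up on the same physical GPU, but under method 1 the process never sees the other GPU at all, while under method 2 cuda:0 remains visible and is still the default current device. That may be relevant to the differing results, though this sketch does not establish the cause.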
Thanks for your suggestion! Could you please provide more details on how to add a deviceGuard? I did not find anything about this method in the official tutorial, so it would be very helpful to cover it in a future version of the tutorial.