When is torch_tensor.to(device) necessary?

In a lot of tutorials online, you will see the line device = torch.device('cuda' if torch.cuda.is_available() else 'cpu'). Throughout any given deep learning model, there are several examples of calls like tensor_a.to(device) and tensor.detach(). When is it necessary to move a tensor to the CPU or GPU, and when is it necessary to detach a tensor from the computation graph? Are there any resources where I can read up on this topic?

The tensor = tensor.to(device) operation moves (or copies) the tensor to the specified device and is needed if you want to execute operations on this tensor on that device (e.g. the GPU). Note that for tensors .to() is not an in-place operation, so you have to reassign the result.
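A minimal sketch of the device-movement pattern (tensor names here are just illustrative):

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.randn(3, 3)   # created on the CPU by default
x = x.to(device)        # reassign: .to() returns a new tensor on the target device

y = torch.randn(3, 3, device=device)  # or create the tensor on the device directly

# Operations require both operands on the same device;
# mixing CPU and GPU tensors raises a RuntimeError.
z = x @ y
```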
tensor.detach() is used to cut the computation graph and make sure that Autograd won't backpropagate through the previous operations. This is used e.g. if you want to store a specific tensor without its entire computation graph (such as loss values for logging), or in GAN training, where the generator's output is detached while training the discriminator on fake samples so that gradients don't flow back into the generator.
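Both use cases can be sketched as follows (the tiny Linear "generator" and "discriminator" are just stand-ins for real models):

```python
import torch
import torch.nn as nn

# --- Case 1: storing a loss value without its computation graph ---
w = torch.randn(2, requires_grad=True)
loss = (w ** 2).sum()

loss_history = []
loss_history.append(loss.detach())  # tensor with no graph attached
loss_scalar = loss.item()           # or extract a plain Python float

# --- Case 2: GAN-style discriminator update ---
gen = nn.Linear(4, 4)    # stand-in for a generator
disc = nn.Linear(4, 1)   # stand-in for a discriminator

noise = torch.randn(8, 4)
fake = gen(noise)

# Detach so the discriminator loss does not backpropagate into the generator
d_out = disc(fake.detach())
d_loss = d_out.mean()
d_loss.backward()  # populates gradients for disc only, not gen
```

Appending loss (without detach) to a list would keep every iteration's graph alive and steadily grow memory usage, which is why .detach() or .item() is the usual pattern for logging.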