CPU/GPU agnostic. Why use .to() instead of simply setting CUDA_VISIBLE_DEVICES?

Hey guys,

I’m pretty new to PyTorch. I’m coming from TensorFlow.

I want my code to be device agnostic.

I am following this tutorial: https://pytorch.org/docs/stable/notes/cuda.html

Basically, using .to(device) everywhere.

However, I realized that I can just:

  • Run on CPU with export CUDA_VISIBLE_DEVICES=;
  • Run on GPU with export CUDA_VISIBLE_DEVICES=0;

Exactly like in TensorFlow, where the code for CPU/GPU is roughly the same.

Why do people bother with .to(device) or .type()?

I can understand why if we deal with multiple GPUs and want finer-grained control over them.

But in the simple case, doesn’t the code get bloated and less readable?


If you don’t use to() to push your data and models to a GPU, all tensors will be created on the CPU by default.
Setting CUDA_VISIBLE_DEVICES won’t change this behavior, so you would still need to somehow declare which tensor should be pushed to the GPU and which should stay on the CPU.
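
To illustrate the point above, here is a minimal sketch, assuming a reasonably recent PyTorch install. The nn.Linear model and the tensor shapes are placeholders chosen for illustration, not something from your code:

```python
import torch
import torch.nn as nn

# Tensors and module parameters are created on the CPU unless told otherwise,
# whatever CUDA_VISIBLE_DEVICES is set to.
x = torch.randn(8, 4)
model = nn.Linear(4, 2)
print(x.device)
print(next(model.parameters()).device)

# Device-agnostic pattern: pick the device once, then move things to it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)  # a module is moved in place; .to() returns the module itself
x = x.to(device)          # a tensor is not moved in place, so reassign the result
out = model(x)
print(out.device)
```

With CUDA_VISIBLE_DEVICES set to an empty value, torch.cuda.is_available() should report False, so the same script falls back to the CPU without any code changes.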

You could set the default tensor type to a CUDA one if a GPU is found, but that feels like an unnecessary hack, since all tensors would then be created, and all calculations performed, on the GPU by default.
Overlapping asynchronous CUDA calls with some CPU computation would hardly be possible.
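
For reference, the default-tensor-type hack would look roughly like the sketch below. It uses torch.set_default_tensor_type, which I believe is deprecated in newer releases in favor of torch.set_default_device, so treat the exact call as an assumption about your PyTorch version:

```python
import torch

# The hack: make newly created float tensors land on the GPU when one is present.
if torch.cuda.is_available():
    torch.set_default_tensor_type(torch.cuda.FloatTensor)

a = torch.randn(3)                 # on the GPU if the branch above ran, else on the CPU
b = torch.zeros(3, device="cpu")   # CPU-side work now needs an explicit device
print(a.device, b.device)

# Restore the usual default so the rest of the program is not affected.
torch.set_default_tensor_type(torch.FloatTensor)
```

Note how anything that should stay on the CPU now has to say so explicitly, so the device bookkeeping has not gone away, it has only been inverted.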

Could you explain your idea a bit more, as I might misunderstand it?

My bad, you are right! I thought for a couple of seconds that we could do the same as in TensorFlow.

Basically, writing code without mentioning anything about device/cuda and still being able to run it on either a GPU or CPU by just setting CUDA_VISIBLE_DEVICES.

Thanks for your help!