DDP w/ multi-CPU: Sending different layers to different devices

It looks like DDP for CPUs has been deprecated, so I assume that distributing training across multiple CPUs is virtually the same as demonstrated in this tutorial. However, I'm confused about how to send a model to the CPU of a specific rank.
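For reference, my CPU-only setup follows the tutorial's `setup()` function, just with the `gloo` backend instead of `nccl` (the address/port values below are placeholders):

```python
import os
import torch.distributed as dist

def setup(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    # gloo supports CPU tensors, unlike nccl
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
```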

That is, in the definition of ToyMpModel in the above tutorial, different layers of the model are sent to different GPUs via the .to() method, which simply takes an integer as its argument. When one has multiple CPUs and no GPUs, how does .to() know to send a layer to the CPU of the rank specified by that integer?
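For reference, the model from the tutorial looks roughly like this (paraphrased; `dev0` and `dev1` are the integer device IDs in question):

```python
import torch
import torch.nn as nn

class ToyMpModel(nn.Module):
    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0 = dev0
        self.dev1 = dev1
        # the integer arguments place each layer on a different device
        self.net1 = nn.Linear(10, 10).to(dev0)
        self.relu = nn.ReLU()
        self.net2 = nn.Linear(10, 5).to(dev1)

    def forward(self, x):
        x = self.relu(self.net1(x.to(self.dev0)))
        return self.net2(x.to(self.dev1))
```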

When I pass different integers to .to() on my local machine (which has a GPU), .to() automatically assumes that I'm sending the model/layer to a CUDA device, even when I set `export CUDA_VISIBLE_DEVICES=""` before running my code.
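A minimal example of what I mean (run after `export CUDA_VISIBLE_DEVICES=""`):

```python
import torch

layer = torch.nn.Linear(10, 10)
# A bare integer is interpreted as a CUDA device index, i.e. this is
# equivalent to layer.to(torch.device("cuda", 1)), so it raises a
# CUDA error here even though I only want to target CPUs.
layer.to(1)
```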