> Use joblib to train an ensemble of small models on the same GPU in parallel

“Time sharing” is not a programming approach; the user seems to describe it as a mechanism for e.g. cloud providers to swap between jobs of different users. PyTorch does not implement such a behavior on your workstation.
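To illustrate the joblib suggestion from the quote above, here is a minimal sketch of training several small models "in parallel" on one device. This assumes joblib's `prefer="threads"` backend so that all workers share the parent process's CUDA context; the model, data, and hyperparameters are purely illustrative, and it falls back to CPU when no GPU is present:

```python
# Sketch: train an ensemble of small models with joblib's threading backend.
# All names (train_one, steps, sizes) are illustrative, not from the thread.
import torch
from torch import nn
from joblib import Parallel, delayed

device = "cuda" if torch.cuda.is_available() else "cpu"

def train_one(seed, steps=20):
    torch.manual_seed(seed)
    model = nn.Linear(10, 1).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(64, 10, device=device)
    y = torch.randn(64, 1, device=device)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# prefer="threads" keeps all workers in one process, so they share the
# same CUDA context; with separate processes, each worker would create
# its own context and add memory overhead.
losses = Parallel(n_jobs=4, prefer="threads")(
    delayed(train_one)(seed) for seed in range(4)
)
print(losses)
```

Note that Python threads still serialize kernel *launches* via the GIL; whether the kernels themselves overlap on the GPU depends on how much of the device each one occupies.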

> Divide the cores of the GPU into multiple groups and divide the GPU memory accordingly.

I don’t fully understand this claim and assume the user means that each kernel should not occupy the entire GPU’s resources, so that concurrent kernels can be executed.
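As a rough sketch of that idea, CUDA streams let you launch independent kernels that *may* overlap on the device when neither one saturates it (there is no user-facing API in PyTorch to partition SMs directly). The sizes here are illustrative, and the example falls back to sequential CPU execution when no GPU is available:

```python
# Sketch: launch independent small kernels on separate CUDA streams so
# they can overlap if each leaves SMs free. Sizes are illustrative.
import torch

if torch.cuda.is_available():
    streams = [torch.cuda.Stream() for _ in range(2)]
    xs = [torch.randn(256, 256, device="cuda") for _ in range(2)]
    outs = [None, None]
    for i, (s, x) in enumerate(zip(streams, xs)):
        with torch.cuda.stream(s):
            # Small matmuls may run concurrently if neither occupies
            # all SMs; large ones will effectively serialize anyway.
            outs[i] = x @ x
    torch.cuda.synchronize()  # wait for both streams to finish
else:
    # CPU fallback: no streams, just run the same ops sequentially.
    xs = [torch.randn(256, 256) for _ in range(2)]
    outs = [x @ x for x in xs]

print([tuple(o.shape) for o in outs])
```

Whether overlap actually happens is up to the hardware scheduler, which is why profiling (e.g. with Nsight Systems) is the only way to verify concurrency.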

Take a look at this topic and the linked GTC talk, which gives you a good overview of how the GPU works.