Recommendations for Multi-GPU training options

I've completed many PyTorch projects on my own desktop, which has a good GPU/CPU setup. I'd like to take things to the next level, but I have no experience with distributed training. I'd appreciate any thoughts or recommendations on resources for:

  1. Training with multiple GPUs on a service like AWS while keeping MY OWN pre-defined directory structure for dataset files (images, annotations, etc.)

  2. Inference on a server after training

  3. Which versions of PyTorch and other Python packages (NumPy, PIL/Pillow, etc.) I would be able to use with such a setup
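For context on point 1, here is roughly how I index my local dataset today, and what I'd want to keep working on a cloud service. This is a minimal sketch assuming a hypothetical layout of `images/*.jpg` paired with `annotations/*.json` (the names and extensions are just placeholders, not my actual files):

```python
from pathlib import Path
import tempfile

def index_dataset(root):
    """Pair each image with its annotation file, assuming a hypothetical
    layout of root/images/*.jpg and root/annotations/*.json."""
    root = Path(root)
    pairs = []
    for img in sorted((root / "images").glob("*.jpg")):
        ann = root / "annotations" / (img.stem + ".json")
        if ann.exists():
            pairs.append((img, ann))
    return pairs

# Demonstrate the pairing on a tiny throwaway dataset
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "annotations").mkdir()
    for name in ("cat", "dog"):
        (root / "images" / f"{name}.jpg").touch()
        (root / "annotations" / f"{name}.json").touch()
    print(len(index_dataset(root)))  # 2
```

A list of pairs like this feeds straight into a `torch.utils.data.Dataset`, so ideally whatever cloud setup I choose would let me mount or sync this directory tree as-is.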

Any thoughts, insights, or resources are much appreciated.