I have a clear understanding of how to load a PyTorch model from a .pth.tar file onto a single prescribed GPU, but in my case the entire model doesn't fit on one GPU. So I need to load a pipeline-parallelized model (from a .pth.tar file) onto several GPUs at once.
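For context, the single-GPU loading I'm referring to looks roughly like this (a minimal sketch with a toy placeholder model; the file name, the `state_dict` key, and the model itself are stand-ins, and the sketch creates its own checkpoint just so it runs end to end):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real model (which in my case is too large for one GPU).
def make_model():
    return nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))

# Simulate an existing checkpoint; normally this file would already exist.
torch.save({"state_dict": make_model().state_dict()}, "checkpoint.pth.tar")

# Single-GPU loading as I currently do it: map every tensor in the
# checkpoint onto one prescribed device, then move the model there.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
checkpoint = torch.load("checkpoint.pth.tar", map_location=device)
model = make_model()
model.load_state_dict(checkpoint["state_dict"])
model.to(device)
```

What I can't see is how to generalize this when the layers of the model need to land on different GPUs.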
Is this possible in plain PyTorch, without additional software? If so, how can it be implemented?
Looking forward to any response. Thanks.