Model Parallelism and NVIDIA NVLINK

Hector_Corrales · December 19, 2018, 12:54pm

As I understand it, in model parallelism you divide a model and train each part on a different GPU.
Here is some example code I found:

import torch.nn as nn

class Network(nn.Module):
    def __init__(self, split_gpus):
        super().__init__()
        self.module1 = (some layers)
        self.module2 = (some layers)

        self.split_gpus = split_gpus
        if self.split_gpus:
            # Place each half of the model on its own GPU
            self.module1.to("cuda:0")
            self.module2.to("cuda:1")

    def forward(self, x):
        x = self.module1(x)
        if self.split_gpus:
            # Move the intermediate activations to the second GPU
            x = x.to("cuda:1")
        return self.module2(x)

My question is, is there a way to join 2 GPUs to be seen as a single GPU with double memory and not having to split the model?
Does NVIDIA NVLINK do this?
If not, what does NVLINK do?

Thanks!

samster25 · February 28, 2019, 2:26am

It sounds like what you want is data parallelism, where the model is replicated on each GPU and each replica works on different data. This works if the model fits on a single one of your GPUs with a batch size of at least 1.
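As a rough illustration of the data-parallel setup described above, here is a minimal sketch using `torch.nn.DataParallel`. The toy model and tensor sizes are assumptions made up for the example, not something from the original question, and the code falls back to a single device when fewer than two GPUs are visible.

```python
import torch
import torch.nn as nn

# Hypothetical toy model, chosen only for illustration
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
if torch.cuda.device_count() > 1:
    # Replicate the model on every visible GPU; each replica
    # receives a slice of the batch along dimension 0
    model = nn.DataParallel(model)
model = model.to(device)

batch = torch.randn(8, 16, device=device)
output = model(batch)  # per-replica outputs are gathered on the first device
print(output.shape)
```

Note that every replica holds a full copy of the model, so this approach splits the batch across GPUs rather than pooling their memory for one larger model.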

NVLINK provides a faster interconnect than PCIe. This allows faster GPU-to-GPU communication for either data or model parallelism.
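One way to see whether your GPUs can talk to each other directly is to query peer access from PyTorch. The sketch below uses `torch.cuda.can_device_access_peer`; the helper name `peer_access_report` is hypothetical. Keep in mind that peer access can exist over PCIe as well as NVLINK, so this check alone does not tell you which link is in use; `nvidia-smi topo -m` shows the link type between each pair of GPUs.

```python
import torch

def peer_access_report():
    """Return {(src, dst): bool} for every ordered pair of visible GPUs."""
    n = torch.cuda.device_count()
    report = {}
    for src in range(n):
        for dst in range(n):
            if src != dst:
                # True if src can directly read/write dst's memory
                report[(src, dst)] = torch.cuda.can_device_access_peer(src, dst)
    return report

report = peer_access_report()
print(report)  # empty dict on a machine with fewer than two GPUs
```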

Rubeen_Mohammad · April 6, 2020, 6:02am

Hi, I’m very new to NVLINK. Before I get started, do we need to write any code to enable NVLINK for model parallelism? If you have any example code for running a model over NVLINK, could you please share it?
Thank you