Hi, I have a model as such:
    model = MyNetwork()

    if self.hparams.nb_gpus > 1:
        model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])

    device = torch.device("cuda:0")
    model.to(device)

    for data in loader:  # loader is a DataLoader, so it's iterated, not called
        gpu_data = data.to(device)
        # ------------> HANGS IN THE MODEL CALL...
        out = model(gpu_data)
Q1: If I don't wrap the model in nn.DataParallel, this works just fine on a single GPU. Any ideas? I'm running this on an HPC compute cluster managed by SLURM, reserving a full node with 4 GPUs (1080 Tis).
Q2: Do I need a separate torch.device call for each GPU? i.e.:

    device_a = torch.device("cuda:0")
    device_b = torch.device("cuda:1")
    device_c = torch.device("cuda:2")
    device_d = torch.device("cuda:3")
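For context, here is a minimal, self-contained version of the pattern I'm using. The tiny MyNetwork and the list-based "loader" are stand-ins for my real model and DataLoader; the snippet falls back to CPU so it also runs on a machine without CUDA:

```python
import torch
import torch.nn as nn

# Stand-in for my real model (hypothetical, just for illustration).
class MyNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc(x)

model = MyNetwork()

# Wrap only when more than one GPU is visible; DataParallel replicates
# the module across device_ids and scatters each batch across them.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))

# A single device object pointing at the first GPU is what I use;
# CPU fallback here only so the snippet runs anywhere.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

# Stand-in loader: two batches of shape (batch, features).
loader = [torch.randn(4, 8) for _ in range(2)]

for data in loader:
    gpu_data = data.to(device)  # inputs go to the first device only
    out = model(gpu_data)       # <-- this is the call that hangs on the cluster
```

On a single GPU (or CPU) this loop completes and `out` has shape (4, 2); with the DataParallel wrapper on the 4-GPU node it hangs at the model call.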