Hi everybody,
I use two models and initialized them like this:
if torch.cuda.device_count() > 1:
    model_cnn = nn.DataParallel(model_cnn)
    model_fc = nn.DataParallel(model_fc)
model_cnn = model_cnn.to(device)
model_fc = model_fc.to(device)
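As a quick sanity check that the wrappers are in place, I can print the following right after the setup (the expected values in the comments are just what I would assume on a 4-GPU machine):

print("visible GPUs:", torch.cuda.device_count())             # expect 4 here
print("wrapped:", isinstance(model_cnn, nn.DataParallel))     # True, since device_count() > 1
print("master device:", next(model_cnn.parameters()).device)  # cuda:0; DataParallel keeps the master copy there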
I trained my models like this:
for epoch in range(num_epochs):
    iter_loader_source = iter(source_loader)
    for _ in range(len(source_loader)):
        batch_data_source, labels_source = next(iter_loader_source)  # next(), not .next(), in Python 3
        batch_data_source = batch_data_source.to(device)
        labels_source = labels_source.to(device)
        x_fc1_source = model_cnn(batch_data_source.float())
        x_fc3_source = model_fc(x_fc1_source)
        optimizer1.zero_grad()
        # loss: combination of CE and MMD (computation omitted here, see below)
        loss.backward()
        optimizer1.step()
The loss is a combination of an MMD loss and a CE loss. The MMD loss is one I defined myself, and the CE loss comes from torch.nn. Both losses were also transferred to the GPU:
criterion = torch.nn.CrossEntropyLoss().to(device)
MMD_loss_calculator = MMD_loss_calculator.to(device)
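For context, the combined loss inside the loop looks roughly like this sketch (lambda_mmd and the second MMD argument x_fc1_target are placeholders here, not my exact code):

ce_term = criterion(x_fc3_source, labels_source)
# MMD between two feature batches; x_fc1_target stands in for the second batch
mmd_term = MMD_loss_calculator(x_fc1_source, x_fc1_target)
loss = ce_term + lambda_mmd * mmd_term  # lambda_mmd: placeholder weighting factor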
The device is defined like this:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
I use a batch size of 32. When I print the input size inside the CNN, I see the following:
In Model: input size torch.Size([8, 1, 1024])
In Model: input size torch.Size([8, 1, 1024])
In Model: input size torch.Size([8, 1, 1024])
In Model: input size torch.Size([8, 1, 1024])
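Those lines come from a print statement in the CNN's forward, roughly like this (the Conv1d layer is just a stand-in for my real architecture):

class ModelCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 16, kernel_size=3, padding=1)

    def forward(self, x):
        print("In Model: input size", x.size())  # per-replica batch: 32 / 4 GPUs = 8
        return self.conv(x)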
It therefore seems like the models are on 4 different GPUs and the data is split equally across those 4 GPUs.
When I check the GPU utilization with “nvidia-smi --query-gpu=utilization.gpu --format=csv -l 5” I get the following:
84 %
1 %
1 %
2 %
It seems like only GPU 0 is doing real work. How can that be? How can I use all 4 GPUs equally? And how is the data split between the GPUs: randomly, or in contiguous chunks like data[:batch], data[batch:batch*2], data[batch*2:batch*3], and data[batch*3:]?
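To make the last question concrete, here is a minimal self-contained probe I could run (ShapeProbe and the labeled input batch are made up purely to observe the split):

import torch
import torch.nn as nn

class ShapeProbe(nn.Module):
    # Tiny module that reports which slice of the batch each replica receives
    def forward(self, x):
        print(f"device={x.device}, shape={tuple(x.shape)}, first sample id={x[0, 0, 0].item():.0f}")
        return x

device = torch.device("cuda:0")
probe = nn.DataParallel(ShapeProbe()).to(device)

# Batch of 32 where sample i is filled with the value i, so the chunks are identifiable
batch = torch.arange(32, dtype=torch.float32).view(32, 1, 1).expand(32, 1, 1024).contiguous()
probe(batch.to(device))
# If the split is contiguous, I would expect first sample ids 0, 8, 16, 24 on cuda:0 to cuda:3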