Hello! I am running code to train a model on one machine with multiple GPUs using torch.nn.DataParallel. The code places the input data on device 0 when loading it. It looks like this:
```python
model = torch.nn.DataParallel(model).cuda()
device = args.gpu if args.gpu is not None else 0
for i, sample in enumerate(loader):
    video, audio, index = sample['frames'], sample['audio'], sample['index']
    video = video.cuda(device, non_blocking=True)
    audio = audio.cuda(device, non_blocking=True)
    index = index.cuda(device, non_blocking=True)
    output = model(video, audio, index)
```
For ease of explanation, I have omitted unimportant details. Now I am trying to add a text modality without specifying the GPU device. The code is written as follows:
```python
model = torch.nn.DataParallel(model).cuda()
device = args.gpu if args.gpu is not None else 0
for i, sample in enumerate(loader):
    video, audio, text, index = sample['frames'], sample['audio'], sample['text'], sample['index']
    video = video.cuda(device, non_blocking=True)
    audio = audio.cuda(device, non_blocking=True)
    index = index.cuda(device, non_blocking=True)
    text = text.cuda()
    output = model(video, audio, text, index)
```
I wonder: is this correct? Why does the data need to be on device 0 when using torch.nn.DataParallel? Thanks.
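In case it helps, here is a minimal self-contained sketch of my setup (the model, tensor shapes, and names are dummy placeholders standing in for my real modules, not my actual code). It falls back to CPU when no GPU is visible, since torch.nn.DataParallel simply calls the wrapped module directly when there are no device ids:

```python
import torch
import torch.nn as nn

# Dummy stand-in for the real multi-modal model (hypothetical shapes).
class AVModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.video_fc = nn.Linear(8, 4)
        self.audio_fc = nn.Linear(8, 4)

    def forward(self, video, audio, index):
        # index is carried along but unused in this toy forward pass
        return self.video_fc(video) + self.audio_fc(audio)

# Use cuda:0 if available, otherwise run the same code on CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = nn.DataParallel(AVModel()).to(device)

# Random tensors in place of a real DataLoader batch (batch size 16).
video = torch.randn(16, 8).to(device, non_blocking=True)
audio = torch.randn(16, 8).to(device, non_blocking=True)
index = torch.arange(16).to(device, non_blocking=True)

output = model(video, audio, index)
print(tuple(output.shape))  # (16, 4)
```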