I have implemented the DataParallel technique to utilize multiple GPUs on a single machine, but I am getting an error in the fit function.
https://github.com/mindee/doctr/blob/main/references/recognition/train_pytorch.py
from fastprogress.fastprogress import master_bar, progress_bar
In fit_one_epoch function:
for images, targets in progress_bar(train_loader, parent=mb):
    images = images.to(device)
    targets = targets.to(device)
In main func:
model = model.to(device)
if device == 'cuda':
    model = nn.DataParallel(model)
    # model = model.to(device)
    cudnn.benchmark = True
Traceback
Traceback (most recent call last):
File "/home2/coremax/Documents/doctr/references/recognition/DP_KR.py", line 481, in <module>
main(args)
File "/home2/coremax/Documents/doctr/references/recognition/DP_KR.py", line 390, in main
fit_one_epoch(model, train_loader, batch_transforms, optimizer, scheduler, mb, amp=args.amp)
File "/home2/coremax/Documents/doctr/references/recognition/DP_KR.py", line 122, in fit_one_epoch
targets = targets.to(device)
AttributeError: 'list' object has no attribute 'to'
Based on the error message it seems your targets are passed as a list from the DataLoader. I don’t understand how nn.DataParallel is related to it as the data loading logic shouldn’t change. In any case, could you describe how you are loading the data and targets in your Dataset.__getitem__?
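To illustrate the point: a tensor has a `.to()` method, but a plain Python list does not, and if the targets are label strings (which is what docTR's recognition datasets yield), they cannot and need not be moved to the GPU. A minimal sketch of a defensive helper (the name `move_targets` is hypothetical, not part of docTR):

```python
# Hypothetical helper (not part of docTR): move targets to a device only when possible.
import torch

def move_targets(targets, device):
    if torch.is_tensor(targets):
        return targets.to(device)
    if isinstance(targets, list) and all(torch.is_tensor(t) for t in targets):
        return [t.to(device) for t in targets]
    # anything else (e.g. a list of label strings) stays on the CPU
    return targets

print(move_targets(["hello", "world"], "cpu"))        # → ['hello', 'world'] (unchanged)
print(move_targets(torch.zeros(2, 3), "cpu").shape)   # → torch.Size([2, 3])
```

In the docTR training loop the simplest fix is to drop the `targets = targets.to(device)` line entirely and pass the list through unchanged.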
You are right! This is the docTR library, and they use different logic for a single GPU. Due to the huge amount of training data, I have to utilize multiple GPUs. The targets variable is the problem for me.
Could you point me to the code which shows a different usage for multi-GPU use cases in the Dataset, please? I still don’t understand how this could be the case since the Dataset is not aware if you are using nn.DataParallel or not.
I did not find multi-GPU usage in the Dataset. Do you mean the DataParallel code? I am using a simple example to implement DataParallel in my code, like the snippet below:
# GPU
if isinstance(args.device, int):
    if not torch.cuda.is_available():
        raise AssertionError("PyTorch cannot access your GPU. Please investigate!")
    if args.device >= torch.cuda.device_count():
        raise ValueError("Invalid device index")
# Silent default switch to GPU if available
elif torch.cuda.is_available():
    args.device = 0
else:
    logging.warning("No accessible GPU, target device set to CPU.")
if torch.cuda.is_available():
    torch.cuda.set_device(args.device)
    model = model.cuda()
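For reference, the usual way to extend a single-GPU setup like the one above to multiple GPUs is to wrap the model in nn.DataParallel when more than one GPU is visible. A minimal sketch of that standard pattern (assumed setup, not docTR's actual code):

```python
# Minimal nn.DataParallel sketch (assumption: standard PyTorch pattern, not docTR's code).
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2)

if torch.cuda.device_count() > 1:
    # DataParallel splits each input batch along dim 0 across the visible GPUs
    model = nn.DataParallel(model)
model = model.to(device)

x = torch.randn(8, 4, device=device)  # inputs are moved explicitly, as in the training loop
out = model(x)
print(out.shape)  # torch.Size([8, 2])
```

Note that DataParallel scatters the batch across GPUs inside the forward call; the Dataset and DataLoader are untouched, which is why the `targets` error cannot come from multi-GPU usage itself.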