Thanks for the reply. In my situation the database is very large, so I cannot save all the augmented data to disk. I came up with a new idea yesterday: use multiple processes:
import glob
import torch.multiprocessing as mp
from torch.utils.data import DataLoader

imgPaths = glob.glob(Folders[0] + '/img/*.*g')
ldmkPaths = glob.glob(Folders[0] + '/img/*.mat')
loaddataParallel(dataTransformed, imgPaths, ldmkPaths) # First do a data augmentation based on the image and landmark paths; save the transformed data inside 'dataTransformed'
training_data_loader = DataLoader(dataset=dataTransformed, num_workers=opt.threads, batch_size=opt.batchSize, shuffle=True) # Put the transformed (augmented) data into a DataLoader
for dataBlockIdx in range(1, 30): # Read data from different folders
    processes = []
    imgPaths = glob.glob(Folders[dataBlockIdx] + '/img/*.*g')
    ldmkPaths = glob.glob(Folders[dataBlockIdx] + '/img/*.mat')
    dataAugmentation = mp.Process(target=loaddataParallel, args=(dataTransformed, imgPaths, ldmkPaths)) # Create a process for the data augmentation
    processes.append(dataAugmentation)
    dataAugmentation.start() # Start the process
    trainNetParallel(epochIdx, unet, training_data_loader) # At the same time, train the network ('epochIdx' comes from an enclosing epoch loop, omitted here)
    for p in processes:
        p.join() # Wait until the 'data augmentation' process ends
    training_data_loader = DataLoader(dataset=dataTransformed, num_workers=opt.threads, batch_size=opt.batchSize, shuffle=True) # Put the newly transformed (augmented) data into a fresh DataLoader
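For reference, here is a minimal sketch of what 'loaddataParallel' could look like. Everything in it is an assumption on my side: 'dataTransformed' is created as a multiprocessing-shared list via mp.Manager() (a plain Python list would not work here, since writes made inside the child process would not be visible to the parent), the .mat landmark files are read with scipy.io.loadmat, and 'augmentPair' is a placeholder for whatever transform you actually apply:

import scipy.io as sio
import torch.multiprocessing as mp
from PIL import Image

manager = mp.Manager()
dataTransformed = manager.list() # shared between the parent and the augmentation process

def loadAndAugment(paths):
    imgPath, ldmkPath = paths
    img = Image.open(imgPath).convert('RGB') # load one image
    ldmk = sio.loadmat(ldmkPath)             # load the matching landmarks
    return augmentPair(img, ldmk)            # 'augmentPair' is a placeholder for your transform

def loaddataParallel(dataTransformed, imgPaths, ldmkPaths):
    samples = [loadAndAugment(p) for p in zip(sorted(imgPaths), sorted(ldmkPaths))]
    del dataTransformed[:]          # drop the previous block...
    dataTransformed.extend(samples) # ...and swap in the new one, visible to the parent after join()

Building the whole block into 'samples' first and swapping it in at the end keeps the window in which the training loop could see a half-filled list small; if that is still a concern, ping-ponging between two shared lists and swapping after join() avoids it entirely.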
The data augmentation and the network training can now run simultaneously. I did not put 'trainNetParallel' into the parallel pool itself, since that may trigger a 'CUDA re-initialization' error under Python 2.7. Without putting it into the pool, the program still runs.
Indeed, using several threads to perform the data loading and augmentation can also help speed things up; this can be done inside 'loaddataParallel'.
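As a sketch of that idea (again under the assumptions above), the per-sample loop can be handed to a thread pool; multiprocessing.dummy provides a thread-backed Pool in the Python 2.7 standard library, and 'nThreads' is an arbitrary placeholder:

from multiprocessing.dummy import Pool as ThreadPool # threads, not processes

def loaddataParallel(dataTransformed, imgPaths, ldmkPaths, nThreads=4):
    pool = ThreadPool(nThreads)
    # each worker thread loads and augments one (image, landmark) pair
    samples = pool.map(loadAndAugment, zip(sorted(imgPaths), sorted(ldmkPaths)))
    pool.close()
    pool.join()
    del dataTransformed[:]
    dataTransformed.extend(samples)

One caveat: threads mainly pay off here because the work is I/O-bound (disk reads); pure-Python augmentation code is still serialized by the GIL, although PIL and NumPy release it for much of their heavy lifting.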
Here comes another question: in the default implementation of PyTorch, does '__getitem__' in 'torch.utils.data.Dataset' run in parallel? All the data transformations are done there.
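To make the question concrete, this is the pattern I mean, with the augmentation done lazily inside '__getitem__' ('LazyAugmentDataset' is just an illustrative name, reusing the 'loadAndAugment' helper sketched above):

from torch.utils.data import Dataset, DataLoader

class LazyAugmentDataset(Dataset):
    def __init__(self, imgPaths, ldmkPaths):
        self.pairs = list(zip(sorted(imgPaths), sorted(ldmkPaths)))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        # all loading and augmentation happens here, one sample at a time
        return loadAndAugment(self.pairs[idx])

# with num_workers > 0, the DataLoader spawns worker processes,
# each of which calls __getitem__ for the indices it is assigned
loader = DataLoader(LazyAugmentDataset(imgPaths, ldmkPaths),
                    num_workers=opt.threads, batch_size=opt.batchSize, shuffle=True)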