Hi, I have a model that requires a lot of 3D data augmentation, which is time-consuming. I would like to run the augmentations in parallel. How can I do that?
I read the other topics in the forum but I didn’t find the answer to my question.
I also have another question: I need to apply some intensity augmentations, such as changing contrast, hue, and saturation, to 3D images. I cannot use the functions available in PyTorch for this. If anyone knows a package that works on NumPy n-dimensional arrays and can change pixel intensities, that would really help me.
If you are using a DataLoader with num_workers>0, multiple worker processes will be used to create the batches in the background.
I don’t know which lib would provide all functionality you need for the 3D data augmentation, but maybe a medical lib could be helpful, such as MONAI or torchio.
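For the intensity part, plain NumPy might already cover some of it. Below is a minimal sketch, assuming a single-channel 3D volume stored as a float array. The function names (`adjust_contrast`, `adjust_gamma`, `random_intensity_augment`) and the sampling ranges are made up for illustration and do not come from any library. Hue and saturation only make sense if the volume has color channels, so they are left out here.

```python
import numpy as np


def adjust_contrast(volume, factor):
    """Scale deviations from the mean intensity (factor > 1 increases contrast)."""
    mean = volume.mean()
    return (volume - mean) * factor + mean


def adjust_gamma(volume, gamma):
    """Apply a gamma curve on the volume rescaled to [0, 1], then restore its range."""
    lo, hi = volume.min(), volume.max()
    if hi == lo:
        # Constant volume: nothing to remap.
        return volume.copy()
    normalized = (volume - lo) / (hi - lo)
    return normalized ** gamma * (hi - lo) + lo


def random_intensity_augment(volume, rng):
    """Randomly perturb contrast and gamma; the ranges are arbitrary assumptions."""
    factor = rng.uniform(0.75, 1.25)
    gamma = rng.uniform(0.7, 1.5)
    return adjust_gamma(adjust_contrast(volume, factor), gamma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.random((16, 32, 32))  # depth, height, width
    augmented = random_intensity_augment(volume, rng)
    print(augmented.shape)
```

Since these functions work on NumPy arrays, they could be called inside `__getitem__` of a Dataset before converting the volume to a tensor.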
Thanks for your answer. So, you mean that when I set num_workers=4, each batch is split into 4 subsets and the augmentations are applied to these 4 subsets in parallel?
Regarding MONAI and torchio, I checked them but didn't find any pixel intensity transformations.
Thank you for your help.
No, each worker will create an individual batch.
So 4 workers will create 4 batches simultaneously, add these batches to a queue and try to prefetch the next one.
There was some effort to use multiple workers for a single batch creation, but I don’t know the status of it.
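To see the per-worker batching described above, here is a small sketch that records which worker built each sample. `RandomVolumes` and `worker_ids_per_batch` are hypothetical names, and the random flip is only a stand-in for a real 3D augmentation. Only `DataLoader`, `Dataset`, and `get_worker_info` come from `torch.utils.data`.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset, get_worker_info


class RandomVolumes(Dataset):
    """Hypothetical dataset returning a small random 3D volume and the worker id."""

    def __init__(self, length=16, shape=(8, 8, 8)):
        self.length = length
        self.shape = shape

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        rng = np.random.default_rng(index)
        volume = rng.random(self.shape, dtype=np.float32)
        # Stand-in augmentation: random flip along the depth axis.
        if rng.random() < 0.5:
            volume = volume[::-1].copy()
        info = get_worker_info()  # None when running in the main process
        worker_id = -1 if info is None else info.id
        return torch.from_numpy(volume), worker_id


def worker_ids_per_batch(num_workers, batch_size=4):
    """Return, for every batch, the set of worker ids that produced its samples."""
    loader = DataLoader(RandomVolumes(), batch_size=batch_size, num_workers=num_workers)
    return [sorted(set(ids.tolist())) for _, ids in loader]


if __name__ == "__main__":
    # Every inner list should contain a single id: one worker per batch.
    print(worker_ids_per_batch(num_workers=2))
```

All samples of a batch should report the same worker id, since the augmentation in `__getitem__` runs inside whichever worker is building that batch.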