Versions:
torch: 1.6.0
torchvision: 0.7.0
I created a custom DataLoader to apply custom augmentations to the images and transform them to tensors. Even with the custom augmentations disabled and num_workers=0 in the DataLoader (I also tried the default by not setting it at all), it still consumes around 15-20 threads (not saturating each thread, but about 9% of each). I am not sure what is causing this, and I want to keep the resources free for the compute-intensive processes of other users.
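For reference, a minimal sketch of the kind of setup I mean (the dataset class and shapes are placeholders, not my actual code; the thread-count check is Linux-only):

```python
import os
import torch
from torch.utils.data import Dataset, DataLoader

class CustomImageDataset(Dataset):
    """Placeholder for my actual dataset; names are illustrative."""
    def __init__(self, n_samples, augment=False):
        self.n_samples = n_samples
        self.augment = augment  # custom augmentations disabled here

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        # stand-in for loading an image and converting it to a tensor
        return torch.rand(3, 224, 224)

loader = DataLoader(CustomImageDataset(100), batch_size=8, num_workers=0)

for batch in loader:
    pass  # iterate once so any lazily created thread pools get spawned

# count OS-level threads of this process (Linux-only check)
print("threads:", len(os.listdir("/proc/self/task")))
```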
Tried:
- Setting OMP_NUM_THREADS and other thread-related environment variables via os.environ in Python to restrict thread usage (see the sketch after this list)
- Setting various combinations of the num_workers and pin_memory parameters
- Optimizing the code to avoid copying tensors to the CPU and computing metrics there with NumPy
- I originally installed torch with conda when creating the environment, but later reinstalled it with pip due to some issues. Could that be causing the problem?
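This is roughly what the environment-variable attempt looked like (a sketch; the key point is that the variables have to be set before torch is imported, since the OpenMP/MKL thread pools are sized at import time):

```python
import os

# must happen before `import torch`, otherwise the OpenMP/MKL
# thread pools have already been created with their default sizes
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import torch

torch.set_num_threads(1)          # limit intra-op parallelism
torch.set_num_interop_threads(1)  # limit inter-op parallelism; must be
                                  # called before any parallel work runs
```

Even with these set, I still see the extra threads.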
@ptrblck: Can you help me understand what is actually causing this, or does the API have some internal operations that are distributed across threads by default?