Just like PyTorch provides options like nn.DataParallel() to efficiently make use of multiple GPUs, is there any such option for general-purpose tensor operations that don't necessarily require multiple GPUs, but rather multi-processing?
For example, suppose I have a function like:
```python
from tqdm import tqdm

def generate_dataset(dataloader, val=False):
    f = open('new_dataset.txt', 'w')
    for x, y in tqdm(dataloader, total=len(dataloader)):
        pred = []
        for x_d in x:
            # NOTE: this is NOT necessarily an nn.Module()
            pred.append(general_tensor_operations(x_d))
        data = format_to_string_data([pred, y.item()])
        f.write(data)
    f.close()
```
Note that,
- I am making use of a torch DataLoader object
- the tensor operation on x is NOT performed by an nn.Module() model
- the tensor operation doesn't work on the whole batch, but rather on each slice of the mini-batch
- and I write the obtained (pred, y) pairs to a new .txt file
Now the processing bottleneck is in general_tensor_operations(). So if I want to parallelize such a function (generate_dataset()), how do I do it? Note that if I use multiple processes, they should work on non-overlapping subsets of the original dataloader (otherwise there will be duplicates in new_dataset.txt). A code snippet would help a ton.
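For what it's worth, here is a rough sketch of the kind of thing I was imagining, though I don't know if it's the right approach. It assumes I can grab the underlying dataset (dataloader.dataset) instead of the dataloader itself, split the indices into disjoint chunks, and have each process write to its own file; num_workers and the per-worker file names are just my guesses:

```python
import multiprocessing as mp
from torch.utils.data import DataLoader, Subset
from tqdm import tqdm

def worker(dataset, indices, out_path):
    # Each worker gets a disjoint list of indices, so no duplicates are written.
    loader = DataLoader(Subset(dataset, indices), batch_size=1)
    with open(out_path, 'w') as f:
        for x, y in tqdm(loader, total=len(loader)):
            pred = [general_tensor_operations(x_d) for x_d in x]
            f.write(format_to_string_data([pred, y.item()]))

def generate_dataset_parallel(dataset, num_workers=4):
    # Split the index range into num_workers non-overlapping chunks.
    all_indices = list(range(len(dataset)))
    chunks = [all_indices[i::num_workers] for i in range(num_workers)]
    procs = []
    for rank, idx in enumerate(chunks):
        p = mp.Process(target=worker, args=(dataset, idx, f'new_dataset_{rank}.txt'))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    # The per-worker files would then be concatenated into new_dataset.txt.
```

But I'm not sure whether this is the idiomatic way to do it with torch, hence the question.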
Thank you in advance!