Consistent results in a distributed environment

Hi All,
I am a new user of torch. I would like to develop a dataloader module to feed data to torch C++ API. My focus is the C++ API.
My dataset is distributed among multiple compute nodes. Each node owns a disjoint portion of data WRT any other nodes. Let’s say I have 5 nodes. They own 200,200,200,300,100 examples of data respectively. Is there a way to achieve the same training result from training on a single node (the node owns all 1000 rows of data) or on a 3-node cluster (the data distribution is different from 5-node cluster case)?

Is there a way to control, or at least know, how torch generates the indices passed to customDataset::get(index) during training and during scoring? Are the indices sequential (1, 2, 3, …) or random?

If I deploy multiple workers to load data on a node, is there a way to know the calling thread's id from within customDataset::get()?

auto data_loader = torch::data::make_data_loader(
    std::move(dataset),
    torch::data::DataLoaderOptions().batch_size(64).workers(2));

Thank you very much.

Can someone offer any insight please?

cc @VitalyFedyunin @SimonW for dataloader
cc @glaringlee for C++ API