Distributed Sampler in C++ frontend

Hello,
I am trying to modify this example C++ code to support distributed training across multiple processes. In Python, torch.utils.data.distributed.DistributedSampler is normally used to shard the dataset across ranks, but I am not sure what its equivalent is in the C++ frontend. I believe I found it in the docs here, but the following code snippet fails with errors:

auto trainset = torch::data::datasets::MNIST("./data")
                  .map(torch::data::transforms::Normalize<>(0.1307, 0.3081))
                  .map(torch::data::transforms::Stack<>());
auto train_sampler = torch::data::samplers::DistributedSampler(trainset.size().value(), numranks, rank, false);
auto train_loader = torch::data::make_data_loader<torch::data::samplers::DistributedSampler>(std::move(trainset), batch_size);
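
To make clear what behaviour I am after, here is a minimal standard-library sketch of the sharding I expect from a distributed sampler: each rank sees its own disjoint subset of the dataset indices. It does not use libtorch at all, and `shard_indices` is a hypothetical helper name of my own. The round-robin assignment is an assumption about the desired result, not a claim about how the libtorch samplers work internally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of libtorch: returns the dataset indices
// that one rank would process when indices are dealt out round-robin
// (rank, rank + num_ranks, rank + 2 * num_ranks, ...).
std::vector<std::size_t> shard_indices(std::size_t dataset_size,
                                       std::size_t num_ranks,
                                       std::size_t rank) {
  std::vector<std::size_t> indices;
  // An invalid configuration yields an empty shard instead of looping forever.
  if (num_ranks == 0 || rank >= num_ranks) {
    return indices;
  }
  for (std::size_t i = rank; i < dataset_size; i += num_ranks) {
    indices.push_back(i);
  }
  return indices;
}
```

For example, with 10 samples and 4 ranks, rank 1 would get indices 1, 5 and 9, and no index would be shared between two ranks.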

Can anyone point out the right way to use it?
Thanks.