Access custom dataset inside the dataloader loop

Hello,
I have created a custom dataset. Within get(), I record the sample index and the filename in std::vector members (index_ and path_).

torch::data::Example<> CustomDataset::get(size_t index) {
    torch::Tensor tData;
    torch::Tensor tLabel;
    std::string   tPath;
....
    //set tPath to an image filename. set tLabel as well.

    auto mat = cv::imread(tPath);
...
    tData = torch::from_blob(mat.data, {mat.rows, mat.cols, 3}, torch::kByte);
    tData = tData.to(torch::kFloat);
    tData = tData.permute({2, 0, 1}); // Channels x Height x Width

    path_.push_back(tPath);
    index_.push_back(index);
    return {tData.clone(), tLabel.clone()};
}

Since I can't change the signature of get(), I need to reach the CustomDataset object inside the data loading loop in order to read these members. In the example below, I set the batch size to 10. At runtime, batch.data.size(0) is 10, but customDataset.path_.size() is 0.

    auto customDataset = CustomDataset();
    auto dataLoader = torch::data::make_data_loader<torch::data::samplers::SequentialSampler>(
            customDataset//std::move(customDataset)
                    .map(torch::data::transforms::Normalize<>({0.5, 0.5, 0.5},{0.5,0.5,0.5}))
                    .map(torch::data::transforms::Stack<>()),
            torch::data::DataLoaderOptions()
                    .batch_size(10)
                    .workers(actionParameters.nThreads)
                    .drop_last(true));

    for (auto& batch : *dataLoader) {
        std::cout << "score::runAction(): customDataset.path_.size() = " << customDataset.path_.size() << std::endl;
        std::cout << "score::runAction(): batch.data.size(0) = " << batch.data.size(0) << std::endl;
...
        customDataset.path_.clear();
        customDataset.index_.clear();
    }

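My impression is that make_data_loader() creates a copy of customDataset, so get() fills the vectors of that copy and the original customDataset object is never touched. If that is right, the original object is no longer needed after the call, and passing it with std::move() would be more efficient.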
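
The copy behaviour suspected above can be reproduced without libtorch. The following is a minimal sketch, not the libtorch implementation: LoggingDataset and ByValueLoader are hypothetical stand-ins, and the only assumption carried over is that the loader receives the dataset by value. Under that assumption, everything get() records ends up in the loader's private copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for CustomDataset: get() logs the path it "loaded".
struct LoggingDataset {
    std::vector<std::string> path_;

    int get(std::size_t index) {
        path_.push_back("img_" + std::to_string(index) + ".png");
        return static_cast<int>(index);
    }
};

// Hypothetical stand-in for a data loader that takes its dataset BY VALUE
// (an assumption about how make_data_loader behaves).
template <typename Dataset>
class ByValueLoader {
public:
    explicit ByValueLoader(Dataset dataset) : dataset_(std::move(dataset)) {}

    // Calls get() on the loader's own copy of the dataset.
    std::vector<int> next_batch(std::size_t batch_size) {
        assert(batch_size > 0);
        std::vector<int> batch;
        for (std::size_t i = 0; i < batch_size; ++i) {
            batch.push_back(dataset_.get(cursor_++));
        }
        return batch;
    }

    // Number of paths recorded inside the loader's private copy.
    std::size_t logged_in_copy() const { return dataset_.path_.size(); }

private:
    Dataset dataset_;
    std::size_t cursor_ = 0;
};
```

Constructing `ByValueLoader<LoggingDataset> loader(original)` and calling `loader.next_batch(10)` leaves `original.path_` empty while `loader.logged_in_copy()` reports 10 entries, which matches the symptom in the loop above.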

I saw a post about the Python API that discusses the dataloader resetting the state of the dataset. In that post, @ptrblck suggested a solution that uses loader.dataset to access the embedded dataset:

for epoch in range(2):
    for idx, data in enumerate(loader):
        print('Epoch {}, idx {}, data.shape {}'.format(epoch, idx, data.shape))
        
    if epoch==0:
        loader.dataset.set_use_cache(True)

What is the equivalent in C++? How can I access the associated custom dataset after make_data_loader()?
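
One possible direction, offered as a sketch and not a confirmed libtorch idiom: if the loader only ever sees copies of the dataset, the bookkeeping can live in an object that all copies share through a std::shared_ptr. SampleLog and SharedLogDataset below are hypothetical names, get() returns a plain int in place of torch::data::Example<>, and the mutex is there on the assumption that several worker threads may call get() at once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log shared by every copy of the dataset.
struct SampleLog {
    std::mutex mutex;
    std::vector<std::size_t> indices;
    std::vector<std::string> paths;

    void record(std::size_t index, std::string path) {
        std::lock_guard<std::mutex> lock(mutex);
        indices.push_back(index);
        paths.push_back(std::move(path));
    }

    // Hand back everything recorded so far and reset the log.
    std::pair<std::vector<std::size_t>, std::vector<std::string>> drain() {
        std::lock_guard<std::mutex> lock(mutex);
        auto out = std::make_pair(std::move(indices), std::move(paths));
        indices.clear();
        paths.clear();
        return out;
    }
};

// Hypothetical dataset: copying it copies the shared_ptr, not the log.
struct SharedLogDataset {
    explicit SharedLogDataset(std::shared_ptr<SampleLog> log)
        : log_(std::move(log)) {
        assert(log_ != nullptr);
    }

    int get(std::size_t index) {
        log_->record(index, "img_" + std::to_string(index) + ".png");
        return static_cast<int>(index);
    }

    std::shared_ptr<SampleLog> log_;
};
```

The calling code keeps its own `std::shared_ptr<SampleLog>` and calls `drain()` inside the batch loop in place of `customDataset.path_.clear()`. With more than one worker, batches may be prefetched, so the drained entries are not guaranteed to line up with the batch currently being processed.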
Thank you very much.