It would help if I could see how you propose to use it in your training loop. The Dataset class is meant to provide everything you need for one training instance. Without more information, it doesn’t make sense to me why you would want to pass something else to it that is different for each item (otherwise you would make it a member of your Dataset and pass it in the constructor). I expect you do have a good reason, but without knowing it I can’t work out what the best solution is.
Generally, you are not going to be able to easily pass a custom object to Dataset.__getitem__, as it is called from within the machinery of the DataLoader class, which only hands it an index.
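To make that concrete, here is a minimal sketch of what the DataLoader actually passes in. SquaresDataset is a made-up toy class, not something from your pipeline; it just records every argument __getitem__ receives.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset that records what __getitem__ is called with."""

    def __init__(self, n):
        self.n = n
        self.seen = []  # every index handed over by the DataLoader

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        # The only argument is the index chosen by the sampler.
        self.seen.append(index)
        return torch.tensor(index * index)


dataset = SquaresDataset(6)
# num_workers=0 keeps loading in the main process, so `seen` is filled in place.
loader = DataLoader(dataset, batch_size=2, shuffle=False, num_workers=0)
batches = [batch.tolist() for batch in loader]
```

After the loop, dataset.seen holds plain integer indices only, so anything else the Dataset needs has to come in through the constructor or be reachable from the index.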
I need to implement it as part of a complex pipeline I have no control over…
When training, the Dataset receives a list of folders at initialisation. __getitem__ works “as usual”: it receives an index, picks the relevant folder, reads the files (image, CSVs…), does some data processing and returns the result.
At inference, I cannot do any reading; instead I want to pass the data in directly (in the form of a dictionary that always has the same keys, each value being a numpy array).
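One way to picture the two modes described above is a Dataset that takes either folders or ready-made dictionaries in its constructor. This is only a sketch under assumptions: FolderOrMemoryDataset, _read_folder and _process are hypothetical names, the file reading is left as a placeholder, and the float32 cast stands in for whatever processing the real pipeline does.

```python
import numpy as np
from torch.utils.data import Dataset


class FolderOrMemoryDataset(Dataset):
    """Hypothetical dataset: reads from folders (training) or from dicts (inference)."""

    def __init__(self, folders=None, samples=None):
        if (folders is None) == (samples is None):
            raise ValueError("pass exactly one of `folders` or `samples`")
        self.folders = folders  # list of folder paths, training mode
        self.samples = samples  # list of dicts of numpy arrays, inference mode

    def __len__(self):
        source = self.folders if self.samples is None else self.samples
        return len(source)

    def _read_folder(self, folder):
        # Placeholder for the real file reading (image, CSVs, ...).
        raise NotImplementedError

    def _process(self, sample):
        # Stand-in for the shared processing step: cast every value to float32.
        return {key: np.asarray(value, dtype=np.float32) for key, value in sample.items()}

    def __getitem__(self, index):
        if self.samples is not None:
            raw = self.samples[index]  # inference: no disk access
        else:
            raw = self._read_folder(self.folders[index])
        return self._process(raw)


# Inference-style usage with one in-memory sample.
inference_ds = FolderOrMemoryDataset(
    samples=[{"image": np.zeros((2, 2)), "meta": np.arange(3)}]
)
item = inference_ds[0]
```

Both modes share the same processing and the same output keys, so the rest of the loop does not need to know which one is active. The cost is that every inference sample has to exist in memory before the Dataset is built.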
Thanks for the quick response. I thought of this solution, but it would force me to store a lot of data in memory. I wonder if there’s a more elegant solution, e.g. writing a custom data fetcher.
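One possible direction for the memory concern, offered as an assumption rather than a confirmed answer: an IterableDataset can wrap whatever produces the dictionaries and yield them one at a time, so only the current batch is held in memory. StreamingDictDataset and fake_pipeline below are hypothetical stand-ins for the real pipeline.

```python
import numpy as np
from torch.utils.data import IterableDataset, DataLoader


class StreamingDictDataset(IterableDataset):
    """Hypothetical dataset that pulls dicts lazily from an upstream source."""

    def __init__(self, sample_source):
        # sample_source: zero-argument callable returning a fresh iterator of dicts.
        self.sample_source = sample_source

    def __iter__(self):
        for sample in self.sample_source():
            # Same keys for every sample; values become float32 numpy arrays.
            yield {k: np.asarray(v, dtype=np.float32) for k, v in sample.items()}


def fake_pipeline():
    # Stand-in for the external pipeline: produces samples one by one.
    for i in range(4):
        yield {"image": np.full((2, 2), i), "meta": np.array([i])}


# num_workers=0: with several workers each one would replay the whole stream
# unless the source is sharded per worker.
loader = DataLoader(StreamingDictDataset(fake_pipeline), batch_size=2, num_workers=0)
batches = list(loader)
```

The default collate function stacks the numpy arrays of each key into one tensor per batch, so downstream code sees the same dictionary layout as in training.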