I am confused about the iterator class of DataLoader. In particular, I wanted to ask whether its implementation has fundamentally changed between PyTorch versions.
In the online documentation I can only find the class _BaseDataLoaderIter(object) and its subclasses _SingleProcessDataLoaderIter(_BaseDataLoaderIter) and _MultiProcessingDataLoaderIter(_BaseDataLoaderIter). However, when I look at the code of my Anaconda installation in anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py
these three classes do not exist. Instead there is only a class called _DataLoaderIter(object), which seems somewhat similar to the implementation of _MultiProcessingDataLoaderIter, but the two are not exactly the same.
Why can't I find the code of _DataLoaderIter(object) in the documentation? Does it have to do with different PyTorch versions? If so, what consequences does that have if I use a custom dataset, sampler and collate_fn? Will they work with either PyTorch version?
All the classes that start with an underscore like _Foo are internal and so are not documented and can change between versions without notice.
The latest big change there I can think of is: https://github.com/pytorch/pytorch/pull/19228
Which version of pytorch do you currently have installed?
Hi, thank you for your reply!
I am currently using version 1.0.1.post2.
Your link does indeed explain that _DataLoaderIter was split up into the two subclasses I mentioned above. Does this mean that if I now implement a custom collate_fn, sampler and dataset, they might no longer work on a newer PyTorch version?
Well, I am trying to make the __getitem__ method accept two indices as parameters so that I can select data from a 3D tensor. To do so, I at least have to rewrite the collate_fn function too. Maybe even more, but I haven't got that far yet.
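For what it is worth, Python's subscript syntax already packs dataset[i, j] into a single tuple before calling __getitem__, so two indices do not strictly need a second parameter. A minimal sketch of that behaviour follows. The class Grid and its data are made up for illustration, and it uses plain nested lists instead of a tensor.

```python
class Grid:
    """Hypothetical container indexed by a pair (i, j)."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, index):
        # obj[i, j] arrives here as the single tuple (i, j).
        i, j = index
        return self.rows[i][j]


dataset = Grid([[10, 11, 12], [20, 21, 22]])

# Two indices in the brackets, one argument in __getitem__.
single = dataset[1, 0]

# The same pattern as a batch fetch: each element of `indices` is one key.
indices = [(0, 1), (1, 2)]
batch = [dataset[i] for i in indices]
```

Here each i in the list comprehension is a tuple, and dataset[i] unpacks it inside __getitem__.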
Thanks a lot for your replies, both of you.
I have just realized that in both implementations, the __getitem__ method is always assumed to take only one argument. In the _DataLoaderIter class there is line 615:
batch = self.collate_fn([self.dataset[i] for i in indices])
and in the other case, when _MapDatasetFetcher is used, there is the line:
data = [self.dataset[idx] for idx in possibly_batched_index]
Thus, both implementations require the __getitem__ method to take exactly one argument. And since both of the above are internal, I guess I should not be changing them.
So is there really no way to adjust the __getitem__ method to accept two indices and so make use of my 3D dataset?
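One possible approach, sketched here under assumptions rather than taken from the PyTorch documentation: both quoted lines call self.dataset[...] with whatever key the sampler produced, so a custom sampler that yields (i, j) tuples should deliver both indices to a single-argument __getitem__. The names PairIndexDataset and PairSampler are hypothetical, and the sketch uses only the public Dataset, Sampler and DataLoader classes with the default collate_fn.

```python
import itertools

import torch
from torch.utils.data import DataLoader, Dataset, Sampler


class PairIndexDataset(Dataset):
    """Hypothetical dataset over a 3D tensor, indexed by an (i, j) tuple."""

    def __init__(self, data):
        self.data = data  # assumed shape: (A, B, C)

    def __len__(self):
        return self.data.shape[0] * self.data.shape[1]

    def __getitem__(self, index):
        # The loader passes one key; here that key is a tuple of two indices.
        i, j = index
        return self.data[i, j]


class PairSampler(Sampler):
    """Hypothetical sampler that yields every (i, j) pair in order."""

    def __init__(self, n_first, n_second):
        self.n_first = n_first
        self.n_second = n_second

    def __iter__(self):
        return iter(itertools.product(range(self.n_first), range(self.n_second)))

    def __len__(self):
        return self.n_first * self.n_second


data = torch.arange(24).reshape(2, 3, 4)
loader = DataLoader(PairIndexDataset(data), batch_size=3, sampler=PairSampler(2, 3))

# Each batch stacks three vectors of length 4 into a (3, 4) tensor.
batches = list(loader)
```

Because the samples returned by __getitem__ are ordinary tensors, the default collate_fn can stack them, so in this sketch no internal class is touched. Whether this fits a given PyTorch version should still be checked against that version.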