While using datapipes (torchdata 0.5.1) I ran into the following:
S3FileLoader feeds into a ZipArchiveLoader. An exception is raised at line 71 of ZipArchiveLoader because children_counter is missing from the parent stream (the one that the S3 loader provides). If I add (before try/except)
Hello, this is the code that reproduces the problem:
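# Fragment from inside a generator method; `self` and `archive` come from the enclosing code, which is not shown.
from torchdata.datapipes.iter import IterableWrapper, S3FileLoader, ZipArchiveLoader

for url, stream in S3FileLoader(IterableWrapper([f's3://{self.__bucket}/{self.__dataset}/zip/{archive}'])):
    if archive.lower().endswith('.zip'):
        for name, doc in ZipArchiveLoader(IterableWrapper([('XXXX', stream)])):
            yield name, doc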
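
To make the failure mode easier to discuss, here is a minimal standalone sketch of what I think is happening. It does not use torchdata at all. `CountingStream`, `make_zip_bytes`, `extract`, and the attribute name `child_counter` are hypothetical stand-ins I made up for illustration, not the library's actual code. The assumption is that each extracted member registers itself on its parent stream by incrementing a counter, which fails when the parent is a raw file object that was never wrapped.

```python
import io
import zipfile


class CountingStream:
    """Hypothetical stand-in for a stream wrapper that tracks open children."""

    def __init__(self, file_obj, parent=None):
        self.file_obj = file_obj
        self.child_counter = 0  # assumed attribute name, for illustration only
        self.parent = parent
        if parent is not None:
            # Raises AttributeError when the parent is a raw file object
            # that was never wrapped and therefore has no counter.
            parent.child_counter += 1


def make_zip_bytes():
    """Build a small in-memory zip archive so the sketch needs no S3 access."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
    return buf.getvalue()


def extract(parent):
    """Read every archive member and register it as a child of `parent`."""
    raw = parent.file_obj if isinstance(parent, CountingStream) else parent
    with zipfile.ZipFile(raw) as zf:
        return [
            (info.filename, CountingStream(io.BytesIO(zf.read(info)), parent))
            for info in zf.infolist()
        ]


data = make_zip_bytes()

# Unwrapped parent: the counter lookup fails, similar to what I see.
try:
    extract(io.BytesIO(data))
except AttributeError as exc:
    print("unwrapped parent fails:", exc)

# Wrapped parent: extraction succeeds and the counter is updated.
wrapped = CountingStream(io.BytesIO(data))
print([name for name, _ in extract(wrapped)], wrapped.child_counter)
```

If this model is roughly right, wrapping the S3 stream before handing it to ZipArchiveLoader might avoid the error, but I have not confirmed that against the torchdata source.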