Hello,
I am currently trying to load the CC100 dataset with torchtext.
I am using the following code to load it:
from torchtext.datasets import CC100
dataset = CC100(root="data", language_code="en")
It runs and returns an object of type “ShardingFilterIterDataPipe”. I have searched Google and GitHub and read a lot of documentation, but I cannot figure out how to correctly load the dataset and access its elements.
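
To make clear what kind of access I am after, here is a minimal sketch. It assumes the returned datapipe behaves like an ordinary Python iterable, which I have not confirmed for CC100. The helper names `take` and `first_cc100_samples` are made up for illustration and are not part of torchtext.

```python
from itertools import islice


def take(iterable, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(iterable, n))


def first_cc100_samples(n=3, root="data", language_code="en"):
    """Load CC100 and return its first n items.

    Assumption: torchtext is installed, the data can be downloaded,
    and the returned datapipe can be iterated like a normal iterable.
    """
    from torchtext.datasets import CC100  # imported lazily on purpose

    datapipe = CC100(root=root, language_code=language_code)
    return take(datapipe, n)


# Intended usage (not executed here because it triggers a large download):
# for item in first_cc100_samples(3):
#     print(item)
```

Is iterating like this the intended way to read the samples, or is there a different API for it?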