Load training data from a single huge file

Hi all,
I'm having trouble reading a large amount of training data from disk.
I have a single huge file (roughly 100 GB) where each line is one training example, so clearly I cannot load it all into memory.
One possible solution I found is to split this huge file into many small files and use Dataset and DataLoader, as mentioned here: Loading huge data functionality

However, with this approach the total number of training examples (the return value of __len__ in the Dataset) becomes the number of small files rather than the true number of examples, and the DataLoader treats each small file as a single example, which is weird.
Can anyone suggest an elegant solution for this situation? Perhaps something in a producer/consumer style?
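
For context, one workaround that keeps __len__ equal to the true number of examples is to index the byte offset of every line in a single pass and have __getitem__ seek to one line on demand. A rough sketch, assuming a plain UTF-8 text file with one example per line; the file name and per-line parsing are placeholders:

```python
from torch.utils.data import Dataset, DataLoader

class LineOffsetDataset(Dataset):
    """Index a huge text file by line-start byte offsets so that
    __len__ reports the true number of examples and __getitem__
    reads only one line from disk at a time."""

    def __init__(self, path):
        self.path = path
        self.offsets = []
        # One pass over the file: remember where each line starts.
        # The offset list is small compared to the data itself.
        with open(path, "rb") as f:
            offset = 0
            for line in f:
                self.offsets.append(offset)
                offset += len(line)

    def __len__(self):
        return len(self.offsets)  # true number of training examples

    def __getitem__(self, idx):
        with open(self.path, "rb") as f:
            f.seek(self.offsets[idx])
            line = f.readline().decode("utf-8").rstrip("\n")
        # Replace this with whatever parsing/tensorization you need.
        return line

# A regular DataLoader then works as usual, including shuffling:
# loader = DataLoader(LineOffsetDataset("train.txt"), batch_size=64,
#                     shuffle=True, num_workers=4)
```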

Are there any solutions for this? Thanks.

Same issue here. Any ideas?

@marchss @jetcai1900 Not yet. In the end, I implemented a producer/consumer loader myself.
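
A minimal sketch of what a producer/consumer loader along those lines might look like (illustrative only, not the original implementation; the file name and batching details are placeholders): one background thread streams lines from the big file into a bounded queue, and the training loop consumes batches from it.

```python
import queue
import threading

class ProducerConsumerLoader:
    """Background thread reads the huge file sequentially and pushes
    batches of lines into a bounded queue; the training loop iterates
    over the loader and pulls batches off the queue."""

    def __init__(self, path, batch_size=32, max_queue=64):
        self.path = path
        self.batch_size = batch_size
        self.q = queue.Queue(maxsize=max_queue)
        self._sentinel = object()  # marks the end of one pass over the file

    def _produce(self):
        batch = []
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                batch.append(line.rstrip("\n"))
                if len(batch) == self.batch_size:
                    self.q.put(batch)  # blocks when the queue is full
                    batch = []
        if batch:
            self.q.put(batch)          # flush the last partial batch
        self.q.put(self._sentinel)

    def __iter__(self):
        t = threading.Thread(target=self._produce, daemon=True)
        t.start()
        while True:
            item = self.q.get()
            if item is self._sentinel:
                break
            yield item

# for batch in ProducerConsumerLoader("train.txt", batch_size=128):
#     ...collate `batch` into tensors and run a training step...
```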

Would you be willing to share a short example with us? Thanks!