Is there any way to monitor the buffer length of a shuffler/prefetcher datapipe?

I’m benchmarking my data pipeline design, and I would like more information about the pipeline at runtime.

For example, I would like to know the occupancy of the prefetch buffer, to see whether the buffer size setting is too large or too small.

Or, in another scenario, I want to benchmark the speed of a data pipeline without counting the time it takes to fill the shuffle buffer. (Currently, I sleep the main thread for a while before trying to get data from the data iterator.)

I'd really appreciate it if someone could shed some light on this!

We currently do not have that functionality, though we are tracking it as a potential feature in a GitHub issue.

Feel free to comment on that issue to show your interest/use case and add any suggestions that you may have. We are also open to accepting a PR that implements that feature.
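
In the meantime, one possible workaround is to put a small prefetching stage of your own around the iterator, so that you control the buffer and can inspect it. The following is only a minimal sketch using the Python standard library. `MonitoredPrefetcher`, `occupancy`, `wait_until_full`, and `benchmark` are hypothetical names made up for this illustration and are not part of any library's API. It covers a prefetch buffer only, not a shuffle buffer, and it does not forward exceptions raised by the source.

```python
import queue
import threading
import time


class MonitoredPrefetcher:
    """Hypothetical prefetching wrapper (not a library class) with a visible buffer."""

    _DONE = object()  # sentinel marking the end of the source

    def __init__(self, source, buffer_size):
        self.buffer_size = buffer_size
        self._queue = queue.Queue(maxsize=buffer_size)
        self._thread = threading.Thread(
            target=self._fill, args=(iter(source),), daemon=True
        )
        self._thread.start()

    def _fill(self, iterator):
        # Producer thread: put() blocks while the buffer is full.
        for item in iterator:
            self._queue.put(item)
        self._queue.put(self._DONE)

    def occupancy(self):
        # Approximate number of buffered entries; at the very end of the
        # stream this count can include the end-of-stream sentinel.
        return self._queue.qsize()

    def wait_until_full(self, timeout=10.0, poll=0.01):
        # Warm-up: return True once the buffer is full or the source is
        # exhausted, False if the timeout expires first.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if self._queue.full() or not self._thread.is_alive():
                return True
            time.sleep(poll)
        return False

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()
        if item is self._DONE:
            # Put the sentinel back so repeated next() calls keep stopping.
            self._queue.put(self._DONE)
            raise StopIteration
        return item


def benchmark(source, buffer_size):
    """Time iteration after warm-up and record occupancy after each item."""
    pipe = MonitoredPrefetcher(source, buffer_size)
    pipe.wait_until_full()  # exclude the buffer fill time from the timing
    occupancy_samples = []
    count = 0
    start = time.perf_counter()
    for _ in pipe:
        count += 1
        occupancy_samples.append(pipe.occupancy())
    elapsed = time.perf_counter() - start
    return count, elapsed, occupancy_samples
```

For example, `count, elapsed, samples = benchmark(my_iterable, buffer_size=8)` (where `my_iterable` is whatever you are iterating over) gives the number of items, the time spent after warm-up, and one occupancy sample per item. Samples that sit near zero suggest the consumer is waiting on the producer, while samples that stay near `buffer_size` suggest the buffer is larger than it needs to be.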