Exception handling when using Datapipes

Hi,

Thanks for using TorchData. While timeout is accepted as an argument, I don’t think HTTPReader has a built-in way to customize how exceptions are handled.

If you would like to skip the URL causing problems, you can consider using a .filter prior to HTTPReader to check if it is possible to establish a connection.
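
As a rough sketch of what such a filter predicate could look like (the name is_reachable, its opener parameter, and the 5-second default are my own placeholders, not part of TorchData), here is a version that uses only the standard library:

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` succeeds within `timeout` seconds.

    `opener` is a hypothetical hook so the check can be swapped out or tested
    without network access; by default it is urllib's own urlopen.
    """
    try:
        # Building the Request can itself raise ValueError for a malformed URL.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout):
            return True
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False
```

You would pass this function to .filter on the DataPipe of URLs before it reaches HTTPReader. Note that it costs one extra request per URL, and that some servers reject HEAD requests even though a normal GET would work, so treat it as a starting point rather than a reliable check.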

I don’t think that will fully address your issue, so the other options I can think of are:

  1. Build on top of HTTPReader but override its exception handling (probably by rewriting __iter__)
  2. Write a new DataPipe that can catch exceptions coming from a source DataPipe
    • I think catching the exception is feasible, but I’m less sure about resuming the DataPipe/Iterator after an exception is raised
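
To make the concern about resuming concrete, here is a minimal sketch of option 2. It is written against plain Python iterables so that it runs without TorchData installed; a real version would presumably subclass IterDataPipe. All names in it (SkipOnError, generator_source, ResumableSource) are hypothetical.

```python
class SkipOnError:
    """Iterate over `source`, swallowing the exception types listed in `catch`."""

    def __init__(self, source, catch=(Exception,)):
        self.source = source
        self.catch = catch
        self.errors = []  # exceptions seen during the most recent iteration

    def __iter__(self):
        self.errors = []
        iterator = iter(self.source)
        while True:
            try:
                item = next(iterator)
            except StopIteration:
                return
            except self.catch as exc:
                # Record the failure, then ask the source for its next item.
                self.errors.append(exc)
                continue
            yield item


def generator_source():
    """Generator-based source: it is finished as soon as it raises."""
    yield "a"
    raise ValueError("bad item")
    yield "c"  # never reached


class ResumableSource:
    """Iterator whose __next__ can fail on one item and still serve later ones."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        self.position = 0
        return self

    def __next__(self):
        if self.position >= len(self.items):
            raise StopIteration
        item = self.items[self.position]
        self.position += 1
        if item is None:
            raise ValueError("bad item")
        return item


from_generator = list(SkipOnError(generator_source()))               # ["a"]
from_iterator = list(SkipOnError(ResumableSource(["a", None, "c"])))  # ["a", "c"]
```

The first result shows the problem: once a generator raises, Python marks it as finished, so the wrapper catches the exception but never sees the remaining items. If the source DataPipe's __iter__ is a generator function, a wrapper like this can only stop cleanly, not skip and continue. Skipping works in the second case because the source keeps its position outside of a generator frame. A real implementation would also want a cap on consecutive errors, since a source that raises on every call would make this loop spin forever.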

We would accept a PR for option 2 if you have a good implementation. Happy to discuss further.

cc: @ejguan