Hi,
I’m trying to use the OpenX API in TorchRL to load a dataset:
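from torchrl.data.datasets import OpenXExperienceReplay

dataset = OpenXExperienceReplay(
    "berkeley_gnm_cory_hall",
    download='force',
    streaming=False,
    root=ds_root,  # ds_root: local dataset directory, defined elsewhere
)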
But it fails in the same way on any dataset from this collection, with this error:
File "~/.cache/huggingface/modules/datasets_modules/datasets/jxu124--OpenX-Embodiment/317e9044a9bb97bb1db9ea5aebf1c15f5cc3e1e071e5da025e97892e96dae22b/OpenX-Embodiment.py", line 29, in decode_image
data = data.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
...
File "...", line 130, in main
dataset = OpenXExperienceReplay(
File ".../python3.10/site-packages/torchrl/data/datasets/openx.py", line 358, in __init__
storage = self._download_and_preproc()
File ".../python3.10/site-packages/torchrl/data/datasets/openx.py", line 484, in _download_and_preproc
dataset = datasets.load_dataset(
File ".../python3.10/site-packages/datasets/load.py", line 2096, in load_dataset
builder_instance.download_and_prepare(
File ".../python3.10/site-packages/datasets/builder.py", line 924, in download_and_prepare
self._download_and_prepare(
File ".../python3.10/site-packages/datasets/builder.py", line 1647, in _download_and_prepare
super()._download_and_prepare(
File ".../python3.10/site-packages/datasets/builder.py", line 999, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File ".../python3.10/site-packages/datasets/builder.py", line 1485, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File ".../python3.10/site-packages/datasets/builder.py", line 1642, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
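One observation that may help narrow this down: byte 0xff at position 0 is what you would see if `decode_image` called `.decode()` on raw JPEG bytes, since JPEG files begin with the marker FF D8. I have not confirmed that this is what the loader script receives, so treat it as a guess. A minimal sketch that reproduces the same message, using made-up bytes (the `fake_jpeg_header` value and the `try_decode` helper are hypothetical, not taken from the dataset or the loader):

```python
# Hypothetical bytes for illustration only: the JPEG start-of-image marker
# (FF D8 FF E0) followed by filler. This is NOT data from the dataset.
fake_jpeg_header = b"\xff\xd8\xff\xe0" + b"\x00" * 8


def try_decode(data: bytes) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        # bytes.decode() defaults to UTF-8, as in the failing line above.
        return data.decode()
    except UnicodeDecodeError as exc:
        return str(exc)


message = try_decode(fake_jpeg_header)
print(message)
```

If that guess is right, the failure would come from the loader script treating binary image data as text, not from the arguments passed to `OpenXExperienceReplay`.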
Has anyone used this API successfully? It seems like it still needs some work.