DataLoader, when num_workers > 0, there is a bug

piojanu · February 2, 2019, 2:06pm

Yeah, I’ve explored the topic a bit, and here is what I found:

With version 1.8 of the HDF5 library, working with HDF5 files and multiprocessing is a lot messier. (Not h5py! I mean the HDF5 library installed on your system: https://unix.stackexchange.com/questions/287974/how-to-check-if-hdf5-is-installed) I highly recommend updating the library to version 1.10, where multiprocessing works better. With 1.8, I was only able to get h5py to work by opening the file in a “with” statement on every access. This seems to add a huge overhead, but I didn’t have time to investigate it properly:

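import h5py
from torch.utils.data import Dataset


class H5Dataset(Dataset):
    def __init__(self, h5_path):
        self.h5_path = h5_path

    def __getitem__(self, index):
        # Open the file on every access, so no handle is shared between workers.
        with h5py.File(self.h5_path, 'r') as file:
            # Do something with file and return data, e.g.:
            return file["dataset"][index]

    def __len__(self):
        with h5py.File(self.h5_path, 'r') as file:
            return len(file["dataset"])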
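
Side note on versions: h5py installed from a pip wheel usually ships its own copy of HDF5, so the version h5py actually uses may differ from the one installed system-wide. A quick way to check what h5py was built against might look like the sketch below. The names `hdf5_version` and `hdf5_version_tuple` on the left-hand side are just local variables I picked, and the 1.10 threshold is only my own experience from above, not an official cutoff.

```python
import h5py

# Version of the HDF5 C library that this h5py build was compiled against.
hdf5_version = h5py.version.hdf5_version              # string, "major.minor.release"
hdf5_version_tuple = h5py.version.hdf5_version_tuple  # the same version as a tuple of ints

print("HDF5:", hdf5_version)
print("h5py:", h5py.version.version)

# Assumption from this post: multiprocessing behaves better from 1.10 on.
if hdf5_version_tuple < (1, 10, 0):
    print("HDF5 is older than 1.10; reopening the file on every access may be needed.")
```

If you want the full build summary in one go, `print(h5py.version.info)` gives it as a single string.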

With version 1.10 of the HDF5 library, I was able to create the h5py.File once, on the first call to __getitem__, and reuse it without errors:

import h5py
from torch.utils.data import Dataset


class H5Dataset(Dataset):
    def __init__(self, h5_path):
        self.h5_path = h5_path
        self.file = None  # Opened lazily on first access, not in __init__

    def __getitem__(self, index):
        if self.file is None:
            self.file = h5py.File(self.h5_path, 'r')
        # Do something with file and return data, e.g.:
        return self.file["dataset"][index]

    def __len__(self):
        with h5py.File(self.h5_path, 'r') as file:
            return len(file["dataset"])
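
For completeness, here is a rough end-to-end sketch of how the lazy-open variant could be used with a DataLoader and several workers. The idea is that the file is first opened inside `__getitem__`, which runs in the worker process, so each worker should end up with its own handle. The helpers `make_toy_file` and `read_all`, the file name `toy.h5`, and the toy data are all made up for illustration; only the `"dataset"` key comes from the snippets above. I have not benchmarked this.

```python
import os
import tempfile

import h5py
import numpy as np
from torch.utils.data import DataLoader, Dataset


class H5Dataset(Dataset):
    """Lazy-open variant: the file is opened on the first __getitem__ call."""

    def __init__(self, h5_path):
        self.h5_path = h5_path
        self.file = None

    def __getitem__(self, index):
        if self.file is None:
            self.file = h5py.File(self.h5_path, "r")
        return self.file["dataset"][index]

    def __len__(self):
        with h5py.File(self.h5_path, "r") as file:
            return len(file["dataset"])


def make_toy_file(path, n_rows=8):
    # Hypothetical toy data: row i holds the value i repeated three times.
    data = np.repeat(np.arange(n_rows, dtype=np.float32)[:, None], 3, axis=1)
    with h5py.File(path, "w") as file:
        file.create_dataset("dataset", data=data)


def read_all(h5_path, num_workers):
    # No shuffling, so batches come back in index order.
    loader = DataLoader(H5Dataset(h5_path), batch_size=4, num_workers=num_workers)
    return [batch for batch in loader]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "toy.h5")
        make_toy_file(path)
        batches = read_all(path, num_workers=2)
        print(len(batches), batches[0].shape)
```

One thing to watch out for: if you index the dataset in the main process before the workers are started, the handle is already open when the workers are created, which defeats the purpose of opening it lazily.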