DataLoader "Too many open files" - When no files should be open!

Hi!

Here is a minimal example that illustrates the issue:

This is the Dataset:

import numpy as np
from torch.utils.data import Dataset


class IceShipDataset(Dataset):
    BAND1 = 'band_1'
    BAND2 = 'band_2'
    IMAGE = 'image'

    @staticmethod
    def get_band_img(sample, band):
        # Each band is stored as a flat list of floats; reshape it into a 75x75 image.
        pic_size = 75
        img = np.array(sample[band])
        img.resize(pic_size, pic_size)
        return img

    def __init__(self, data, transform=None):
        self.data = data
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = self.data[idx]
        band1_img = IceShipDataset.get_band_img(sample, self.BAND1)
        band2_img = IceShipDataset.get_band_img(sample, self.BAND2)
        # Stack the two bands into a single 75x75x2 image.
        img = np.stack([band1_img, band2_img], 2)
        sample[self.IMAGE] = img
        if self.transform is not None:
            sample = self.transform(sample)
        return sample

And this is the code which fails:

import json

import torch

PLAY_BATCH_SIZE = 4

# Load the data. There are 1604 examples.
with open('train.json', 'r') as f:
    data = f.read()
data = json.loads(data)

ds = IceShipDataset(data)
playloader = torch.utils.data.DataLoader(ds, batch_size=PLAY_BATCH_SIZE,
                                          shuffle=False, num_workers=4)
for i, data in enumerate(playloader):
    print(i)

It throws that weird "too many open files" error inside the for loop…
And this is the state of my machine when getting that weird error:

yoni@yoni-Lenovo-Z710:~$ lsof | wc -l
89114
yoni@yoni-Lenovo-Z710:~$ cat /proc/sys/fs/file-max
791958

My torch version is 0.3.0.post4

If you want the JSON file, it is available on Kaggle (https://www.kaggle.com/c/statoil-iceberg-classifier-challenge).

What am I doing wrong here?! :stuck_out_tongue:


Supposing it’s a Unix-based system, then everything is a file. Other processes might be using a lot of file descriptors. Try increasing your open-file limit, or restart.
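
To check the limit that actually applies to your Python process (as opposed to the system-wide fs/file-max), here is a minimal sketch using the standard resource module, assuming a Unix system:

import resource

# Current soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('soft limit:', soft, 'hard limit:', hard)

# Raise the soft limit up to the hard limit
# (only root can raise the hard limit itself).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

On many distributions the default soft limit is 1024, which is far below the system-wide fs/file-max figure.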

It is Ubuntu, but that isn’t the issue at all, and this isn’t the first time I’ve used DataLoader…

Oh, the following shows this isn’t the issue (unless DataLoader opens over 800,000 files, in which case I would like to know why :stuck_out_tongue: ):

yoni@yoni-Lenovo-Z710:~$ lsof | wc -l
89114
yoni@yoni-Lenovo-Z710:~$ cat /proc/sys/fs/file-max
791958

Hi guys,

Were you able to get a resolution here? I’m seeing the same error. I’m using torch==0.3.1 on Ubuntu 16.04, and I have other working data loaders that don’t hit this problem.

I am seeing the same "too many open files" issue with the DataLoader on Ubuntu 16.04. Any suggestions for a solution?

Have you tried @SimonW’s suggestion of increasing the open-file limit?
A restart might also help if your system is currently holding a lot of files open.

Yes, I did, and it does not complain now. But I am still concerned that it could happen again if the limit is reached in some future case. I wonder what causes the data loader to open that many files? Is it a bug?
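
With num_workers > 0, the DataLoader workers send batches back to the main process through shared memory, and on Linux PyTorch’s default file_descriptor sharing strategy keeps one descriptor open for every shared tensor that is still referenced, so the count grows with the number of in-flight batches rather than with actual files on disk. A commonly suggested workaround is switching to the file_system sharing strategy; a minimal sketch, assuming the switch happens before the DataLoader is created:

import torch.multiprocessing as mp

# The default on Linux is usually 'file_descriptor', which can exhaust the
# per-process open-file limit when many tensors are shared between workers.
print(mp.get_sharing_strategy())

# Switch to the 'file_system' strategy before constructing the DataLoader.
mp.set_sharing_strategy('file_system')

The file_system strategy has its own trade-offs (shared-memory files can leak if a process dies unexpectedly), so raising the descriptor limit is often the simpler route.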