ImageFolder doesn't cause OOM while my customized Dataset causes OOM


(Aliolion) #1

Hi,
I’ve used PyTorch for my project. I run it on an Nvidia GTX GPU with 4 GB of memory. I have a customized data loader, like the following code:

import os

from PIL import Image
from torch.utils import data


class LoadDataset(data.Dataset):

    def __init__(self, root, type_data):
        self.filenames = []
        self.root = root
        self.transform = TRANSFORM_IMG[type_data]
        # The sub-directories of root are the class names
        self.classes = next(os.walk(self.root))[1]

        for c in self.classes:
            c_dir = os.path.join(self.root, c)
            for k in next(os.walk(c_dir))[2]:
                if k.endswith('.png'):
                    self.filenames.append(os.path.join(c_dir, k))

    def __len__(self):
        return len(self.filenames)

    def __getitem__(self, index):
        # Images are opened lazily, one sample at a time
        path = self.filenames[index]
        image = Image.open(path).convert("RGB")
        image = self.transform(image)
        # The class directories are named with their integer labels;
        # taking the parent directory's name is portable, unlike
        # splitting the path string on "\\"
        label = int(os.path.basename(os.path.dirname(path)))
        sample = {'image': image, 'path': path, 'label': label}

        return sample
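For context, a `Dataset` like this is consumed through a `DataLoader`, whose default collate function batches each field of the sample dict separately. A minimal sketch with a stand-in dataset (the shapes, names, and batch size here are made up for illustration, not taken from the original post):

```python
import torch
from torch.utils.data import DataLoader, Dataset


# Stand-in dataset returning the same dict-style samples as LoadDataset
class ToyDataset(Dataset):
    def __init__(self, n=8):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        return {'image': torch.zeros(3, 224, 224),
                'path': f'img_{index}.png',
                'label': index % 2}


# The default collate_fn stacks tensors, converts ints to a tensor,
# and keeps strings as a list, field by field
loader = DataLoader(ToyDataset(), batch_size=4)
batch = next(iter(loader))
print(batch['image'].shape)  # torch.Size([4, 3, 224, 224])
print(batch['label'])        # tensor([0, 1, 0, 1])
```

Nothing in this wrapping keeps more than one batch of images in memory at a time, which is why a per-sample `__getitem__` like the one above is the usual pattern.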

And actually, my data is also laid out in the directory structure that torchvision’s ImageFolder expects (one sub-directory per class).

When I tried it with the following image transformations:

TRANSFORM_IMG = {
    'train':
        transforms.Compose([
            transforms.Resize(224),
            transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
            transforms.RandomAffine(degrees=0, translate=(0, 0.001)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])]),
    'validation':
        transforms.Compose([
            transforms.Resize(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])])
}

my customized loader crashes with an out-of-memory (OOM) error when I run on my GPU. However, the same pipeline built on ImageFolder runs without any complaint and I can finish my training.

So, does ImageFolder have some optimization that makes it consume fewer resources than my customized method? And if I stick with my customized method, what should I do to avoid the OOM?
Thank you in advance.


(Lalu Erfandi Maula Yusnu) #2

My suggestion is to buy a new GPU :smile:
But I have another suggestion for you: try to limit your batch size. I don’t think your problem is in your custom dataset. It happens when a batch of the validation dataset is loaded into GPU memory and your GPU doesn’t have enough capacity for it…
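As a back-of-the-envelope check of this advice: the memory taken by the input batch alone grows linearly with batch size (a rough sketch; the model’s activations and gradients come on top of this and usually dominate):

```python
# Rough, model-independent estimate: bytes for one batch of
# 3 x 224 x 224 float32 images, converted to MiB
def input_batch_mib(batch_size, channels=3, height=224, width=224):
    bytes_per_float32 = 4
    return batch_size * channels * height * width * bytes_per_float32 / 2**20

print(input_batch_mib(64))  # 36.75 MiB for the inputs alone
print(input_batch_mib(8))   # 4.59375 MiB: scales linearly with batch size
```

So cutting the batch size is the first knob to turn when a 4 GB card runs out of memory, even though the inputs themselves are a small part of the total.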


(Aliolion) #3

Hi,
Thank you for your suggestion. If only I had more bucks, I would grab a new GPU for sure…

I’ve also tried reducing the batch size, but the behavior stays the same: after a few iterations the training eventually throws an OOM when I use my customized dataset (class LoadDataset(…)), whereas ImageFolder seems to handle this on its own.

So, what I’m really interested in is what makes ImageFolder work without OOM while my customized method fails under the same conditions.

Any idea what it is?

Thanks