How can I apply transforms.Normalize on DataLoader

Hello,

I have a dataset loaded from NumPy arrays. I would like to apply transforms.Normalize:

    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

which is usually applied to ImageFolder as follows:

    train_loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(traindir, transforms.Compose([
            transforms.RandomSizedCrop(224),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            normalize,
        ]))
    )

However, my data is in NumPy arrays, so I build my dataset as follows:

    data_train = torch.from_numpy(data_train).expand(-1, 3, -1, -1)
    labels_train = torch.from_numpy(output_le[:len(labels_train)])
    train_data = torch.utils.data.TensorDataset(data_train.float(), labels_train)

    train_loader = torch.utils.data.DataLoader(
        train_data, batch_size=args.batch_size, shuffle=(train_sampler is None),
        num_workers=args.workers, pin_memory=True, sampler=train_sampler)

How can I apply

    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

in this case?

Thank you,

I’m not sure you can apply a transform on DataLoader. Maybe you could subclass TensorDataset and add a transform argument to the constructor, then override __getitem__ to call the parent’s __getitem__ and apply the transform to the returned data.

Or, in your case, why don’t you just normalize the data before passing it to TensorDataset?
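A rough sketch of what normalizing up front could look like, assuming the images are already scaled to [0, 1] (as ToTensor would produce) and stored as single-channel arrays of shape (N, 1, H, W). The random arrays below are placeholders, not the data from the question.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Placeholder standing in for the question's arrays: 8 single-channel 4x4 images.
data_train = np.random.rand(8, 1, 4, 4).astype(np.float32)
labels_train = np.random.randint(0, 2, size=8)

# Reshape the statistics to (1, 3, 1, 1) so they broadcast over batch, height and width.
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

# Repeat the single channel three times, as in the question, then normalize
# the whole dataset at once; each channel gets its own mean and std.
images = torch.from_numpy(data_train).expand(-1, 3, -1, -1).float()
images = (images - mean) / std

train_data = TensorDataset(images, torch.from_numpy(labels_train))
```

Since nothing here is random per sample, doing it once before building the dataset gives the same result as normalizing inside __getitem__, without repeating the work every epoch.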

@simopal6 thank you. Yes, I think that is the simplest way to do it.
I’m also wondering why there are 4 values for the mean and 4 values for the std?

mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

Thank you

It’s actually 3 and 3 :) Each color channel of the input image is normalized with its own mean and standard deviation, so there is one value per channel (R, G, B).
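To spell out the arithmetic, here is a small sketch of the per-channel computation, written as an explicit loop over channels. The random image is just a placeholder.

```python
import torch

mean = [0.485, 0.456, 0.406]  # one value per channel: R, G, B
std = [0.229, 0.224, 0.225]

# Placeholder RGB image with shape (channels, height, width).
image = torch.rand(3, 4, 4)

# Channel c is shifted by mean[c] and scaled by std[c], independently of the others.
normalized = torch.stack([(image[c] - mean[c]) / std[c] for c in range(3)])
```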
