Torchvision Normalize - how does it operate on a tuple of means/stds?

jchaykow · September 9, 2019, 5:47am

I don’t understand how the Normalize transform from torchvision works. Ultimately I want to build a custom normalize class, so I need to figure out how the built-in one works first.

The docs show the __init__ like this:

def __init__(self, mean, std, inplace=False):
    self.mean = mean
    self.std = std
    self.inplace = inplace

When I use the built-in transform (not a custom class), I usually pass these parameters as a list or tuple with one value per channel:

transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

But if I look at the call:

return F.normalize(tensor, self.mean, self.std, self.inplace)

All this does is pass the tuples to F.normalize(), which as far as I can tell only accepts a single value for its p parameter.

The class must iterate over the channels somehow to make this work, but how does it do that, and how can I implement the same thing in a custom class?
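One thing worth checking here: inside torchvision's transforms module, F is torchvision.transforms.functional, not torch.nn.functional. The p parameter belongs to torch.nn.functional.normalize, which is a different function (it rescales vectors to unit norm). The per-channel behaviour does not need an explicit loop either, since broadcasting can apply one mean and one std per channel. The following is a minimal sketch of that idea, not torchvision's actual source; the helper name normalize_per_channel is made up for illustration, and it assumes a float tensor of shape (C, H, W).

```python
import torch


def normalize_per_channel(image, mean, std):
    """Hypothetical helper: normalize a (C, H, W) float tensor per channel."""
    # Turn the per-channel sequences into tensors of shape (C,).
    mean = torch.as_tensor(mean, dtype=image.dtype, device=image.device)
    std = torch.as_tensor(std, dtype=image.dtype, device=image.device)
    # Reshape to (C, 1, 1) so broadcasting applies one value per channel.
    return (image - mean[:, None, None]) / std[:, None, None]


# Example input: a random 3-channel image (assumed to be already a tensor).
image = torch.rand(3, 4, 5)
out = normalize_per_channel(image, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```

The result has the same shape as the input, and each channel c ends up as (image[c] - mean[c]) / std[c].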

Based on this tutorial, I would write it like this:

class Normalize(object):
    """Normalize the image in a sample using per-channel mean and std."""
    
    def __init__(self, mean, std, inplace=False):
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        return {'image': F.normalize(image, self.mean, self.std, self.inplace),
                'landmarks': landmarks}

But this does not work because it does not go through each channel.
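A likely cause is which F is imported. If F is torch.nn.functional, the mean list lands in the p argument and the call fails, whereas torchvision.transforms.functional.normalize takes the mean and std sequences directly. To keep the dependency explicit, here is a sketch of a dict-based transform that does the per-channel arithmetic itself with plain torch. The class name NormalizeSample is hypothetical, and it assumes sample['image'] is already a float tensor of shape (C, H, W), for example after a ToTensor step.

```python
import torch


class NormalizeSample:
    """Hypothetical transform: normalize sample['image'] per channel."""

    def __init__(self, mean, std, inplace=False):
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        if not self.inplace:
            # Work on a copy so the caller's tensor is left untouched.
            image = image.clone()
        mean = torch.as_tensor(self.mean, dtype=image.dtype, device=image.device)
        std = torch.as_tensor(self.std, dtype=image.dtype, device=image.device)
        # (C,) -> (C, 1, 1): broadcasting handles every channel at once.
        image.sub_(mean[:, None, None]).div_(std[:, None, None])
        return {'image': image, 'landmarks': landmarks}


# Usage with a toy sample: an all-ones image and dummy landmarks.
sample = {'image': torch.ones(3, 2, 2), 'landmarks': torch.zeros(4, 2)}
result = NormalizeSample([0.0, 0.5, 1.0], [1.0, 0.5, 2.0])(sample)
```

The landmarks pass through unchanged; only the image is normalized, and with inplace=False the original sample is not modified.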