Simple way to inverse normalize a batch of input variables

I’m trying to modify my image classifier by adding a decoder and a reconstruction loss, as in an autoencoder.
I want to use BCELoss, which requires targets in the range 0 to 1.
But my classifier normalizes its input in the data loader, as usual, so the input is no longer in that range.
So I want to map it back to the original range by inverting the normalization.

I’m working on CIFAR-10, so I have mean (0.4914, 0.4822, 0.4465) and std (0.2023, 0.1994, 0.2010),
and the input shape is (batch, channel, H, W).

What is the best way to inverse normalize in this case?
Or is there any way to get both the normalized and the unnormalized input from the train loader?

y = (x - mean) / std

x = (y * std) + mean

Just do such an operation per-channel on your output:

# assuming x and y are Batch x 3 x H x W and mean = (0.4914, 0.4822, 0.4465), std = (0.2023, 0.1994, 0.2010)
x = torch.empty_like(y)  # uninitialized tensor with the same shape, dtype and device as y
x[:, 0, :, :] = y[:, 0, :, :] * std[0] + mean[0]
x[:, 1, :, :] = y[:, 1, :, :] * std[1] + mean[1]
x[:, 2, :, :] = y[:, 2, :, :] * std[2] + mean[2]

Thanks for the solution @smth!

If I replace the per-channel operation with a single element-wise tensor multiplication, by expanding the mean and std in the proper way, will it be faster? Or, since it's all part of the graph, will it not make much difference?

Something like this:

t_mean = torch.FloatTensor(mean).view(3,1,1).expand(3,H,W).contiguous().view(1,3,H,W)
t_std = torch.FloatTensor(std).view(3,1,1).expand(3,H,W).contiguous().view(1,3,H,W)
x = y * t_std.expand(B,3,H,W) + t_mean.expand(B,3,H,W)

I’m not sure what the simplest way to get the shapes right is, though…

Because it’s only 3 channels, it shouldn’t matter either way.
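
On the question of the simplest way to get the shapes right: a minimal sketch, assuming a PyTorch version with broadcasting, is to reshape the mean and std to (1, C, 1, 1) and let broadcasting handle the batch and spatial dimensions, so no explicit expand is needed. The helper name denormalize and the random batch are made up for illustration; the final clamp is only a precaution against floating-point round-off pushing values slightly outside 0 to 1 before they are used as BCELoss targets.

```python
import torch

# CIFAR-10 statistics quoted in the question.
mean = (0.4914, 0.4822, 0.4465)
std = (0.2023, 0.1994, 0.2010)


def denormalize(y, mean, std):
    """Invert y = (x - mean) / std for a batch of shape (B, C, H, W).

    Hypothetical helper, not part of torchvision.
    """
    # Shape (1, C, 1, 1) broadcasts over the batch and spatial dimensions.
    t_mean = torch.as_tensor(mean, dtype=y.dtype, device=y.device).view(1, -1, 1, 1)
    t_std = torch.as_tensor(std, dtype=y.dtype, device=y.device).view(1, -1, 1, 1)
    return y * t_std + t_mean


# Round-trip check on random data standing in for a real batch.
original = torch.rand(4, 3, 32, 32)
normalized = (original - torch.tensor(mean).view(1, 3, 1, 1)) / torch.tensor(std).view(1, 3, 1, 1)

# Clamp as a precaution so the result stays within [0, 1].
restored = denormalize(normalized, mean, std).clamp(0.0, 1.0)
```

As noted above, with only 3 channels this is more about readability than speed.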

Hello,

I had the same problem, but came up with a somewhat different approach which may be of interest. I have defined the following transform, which lets you return both the normalised and the unnormalised version of the data from the data loader:

class Branch(object):

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        outputs = []
        for t in self.transforms:
            outputs.append(t(img))
        return outputs

    def __repr__(self):
        format_string = self.__class__.__name__ + '('
        for t in self.transforms:
            format_string += '\n'
            format_string += '    {0}'.format(t)
        format_string += '\n)'
        return format_string

Use it like this, for example:

    train_set = datasets.MNIST(
        '../data/mnist',
        train=True,
        download=True,
        transform=Branch([
            transforms.Compose([
                transforms.ToTensor(),
                transforms.Normalize((0.1307,), (0.3081,))]),
            transforms.ToTensor()]))
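
To show how the two versions might then be read from a data loader, here is a small sketch. It assumes the default DataLoader collate function, which batches each element of the list returned by the transform separately. ToyImages, normalize and identity are hypothetical stand-ins for MNIST and the torchvision transforms so that the example runs without a download; Branch is repeated in shortened form.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class Branch(object):
    """Shortened copy of the transform above: apply every transform to the same input."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        return [t(img) for t in self.transforms]


class ToyImages(Dataset):
    """Hypothetical in-memory dataset standing in for MNIST."""

    def __init__(self, n, transform):
        self.data = torch.rand(n, 1, 28, 28)       # values already in [0, 1]
        self.labels = torch.randint(0, 10, (n,))
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.transform(self.data[idx]), self.labels[idx]


def normalize(img):
    # Same statistics as the Normalize call in the MNIST example.
    return (img - 0.1307) / 0.3081


def identity(img):
    return img


train_set = ToyImages(8, Branch([normalize, identity]))
loader = DataLoader(train_set, batch_size=4)

# Each batch is ([normalized_batch, raw_batch], labels).
(normalized, raw), target = next(iter(loader))
```

Here normalized would feed the classifier, while raw stays in the 0 to 1 range and can serve as the reconstruction target.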