I’m trying to modify my image classifier by adding a decoder and a reconstruction loss, turning it into an autoencoder.
I want to use BCELoss, which requires targets in the range 0 to 1.
But my classifier normalizes its input in the data loader, as usual, so the input is outside that range.
So I want to map it back to the original range with an inverse normalization.
I’m working on CIFAR-10, so I have mean (0.4914, 0.4822, 0.4465) and std (0.2023, 0.1994, 0.2010),
and the input shape is (batch, channel, H, W).
What is the best way to inverse normalize in this case?
Or, is there any way to get both the normalized and unnormalized input from the train loader?
If I replace the per-channel operation with a single element-wise tensor multiplication by expanding the mean and std to the right shape, will it be faster? Or, since it all ends up in the graph anyway, will it not matter much?
Something like this:
t_mean = torch.tensor(mean).view(3, 1, 1).expand(3, H, W).contiguous().view(1, 3, H, W)
t_std = torch.tensor(std).view(3, 1, 1).expand(3, H, W).contiguous().view(1, 3, H, W)
x = y * t_std.expand(B, 3, H, W) + t_mean.expand(B, 3, H, W)
I’m not sure what the simplest way to reshape them properly is, though…
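For reference, a minimal sketch of the inverse normalization that relies on PyTorch broadcasting instead of explicit `expand` calls, so no shape juggling is needed (the sizes `B`, `H`, `W` and the tensor `y` here are placeholders):

```python
import torch

mean = (0.4914, 0.4822, 0.4465)
std = (0.2023, 0.1994, 0.2010)

B, H, W = 4, 32, 32
y = torch.randn(B, 3, H, W)  # stand-in for a batch of normalized images

# view the per-channel stats as (1, 3, 1, 1); broadcasting then
# stretches them over the batch, H, and W dimensions automatically
t_mean = torch.tensor(mean).view(1, 3, 1, 1)
t_std = torch.tensor(std).view(1, 3, 1, 1)

# inverse of Normalize's (x - mean) / std
x = y * t_std + t_mean
```

Broadcasting does not materialize the expanded tensors, so this should be at least as fast as explicit `expand(...).contiguous()`, and either way the cost is negligible next to a forward pass.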
I had the same problem, but came up with a somewhat different approach which may be of interest. I defined the following transform, which lets you return both the normalised and the unnormalised version of the data from the data loader:
class Branch(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        outputs = []
        for t in self.transforms:
            outputs.append(t(img))
        return outputs

    def __repr__(self):
        format_string = self.__class__.__name__ + '('
        for t in self.transforms:
            format_string += '\n'
            format_string += '    {0}'.format(t)
        format_string += '\n)'
        return format_string
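To illustrate how Branch is used, here is a small self-contained sketch. In practice you would put Branch at the end of a torchvision `transforms.Compose`, with `transforms.Normalize(mean, std)` as one branch and an identity as the other; the lambdas below are stand-ins for those transforms so the snippet runs without torchvision:

```python
import torch

class Branch(object):  # the transform defined above, __repr__ omitted
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        return [t(img) for t in self.transforms]

mean = torch.tensor((0.4914, 0.4822, 0.4465)).view(3, 1, 1)
std = torch.tensor((0.2023, 0.1994, 0.2010)).view(3, 1, 1)

# stand-ins for transforms.Normalize(mean, std) and an identity transform
normalize = lambda t: (t - mean) / std
identity = lambda t: t

img = torch.rand(3, 32, 32)  # what ToTensor() would produce, values in [0, 1]
normed, raw = Branch([normalize, identity])(img)
```

Each sample then comes out of the loader as a pair, so the classifier can consume the normalised tensor while the reconstruction loss uses the raw one.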