I was going through the neural style transfer program in PyTorch and have a doubt about the usage of view in the __init__ method of the Normalization class. Just before this class we defined
cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406]).to(device)
cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225]).to(device)
And inside the __init__ method were the lines
self.mean = torch.tensor(mean).view(-1, 1, 1)
self.std = torch.tensor(std).view(-1, 1, 1)
Why did we reshape them? Wouldn't broadcasting take care of it?
My current understanding is that each of the three numbers in the tensors defined above is supposed to normalize a single dimension (one each for height, width and classes). Is that right?
It might be a bit silly, but I'm having a hard time building an intuition in 3D. Any help would be much appreciated, thanks!
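For what it's worth, here is a tiny check I tried while writing this up (a dummy 4x4 image, not the actual style-transfer input), which seems to show that the plain [3]-shaped tensor does not broadcast the way I expected:

```python
import torch

img = torch.rand(2, 3, 4, 4)                 # [B, C, H, W]
mean = torch.tensor([0.485, 0.456, 0.406])   # shape [3]

# Broadcasting aligns trailing dimensions, so a shape-[3] tensor
# lines up with W (the last dimension), not with the channel dim C.
# Here W is 4, so the subtraction fails outright:
try:
    img - mean
except RuntimeError as e:
    print("plain [3] tensor fails:", e)

# After .view(-1, 1, 1) the shape is [3, 1, 1], which lines up with
# [C, H, W] and broadcasts one value per channel over H and W:
out = img - mean.view(-1, 1, 1)
print(out.shape)  # torch.Size([2, 3, 4, 4])
```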
Edit:
The complete __init__ method is
def __init__(self, mean, std):
    super(Normalization, self).__init__()
    # .view the mean and std to make them [C x 1 x 1] so that they can
    # directly work with image Tensor of shape [B x C x H x W].
    # B is batch size. C is number of channels. H is height and W is width.
    self.mean = torch.tensor(mean).view(-1, 1, 1)
    self.std = torch.tensor(std).view(-1, 1, 1)
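To make the question self-contained, here is a minimal runnable sketch of the whole module as I understand it. The forward method is my assumption (the usual (img - mean) / std normalization); only __init__ is from the code above:

```python
import torch
import torch.nn as nn

class Normalization(nn.Module):
    def __init__(self, mean, std):
        super(Normalization, self).__init__()
        # [C] -> [C, 1, 1] so each value broadcasts over one channel
        self.mean = torch.tensor(mean).view(-1, 1, 1)
        self.std = torch.tensor(std).view(-1, 1, 1)

    def forward(self, img):
        # assumed: per-channel normalization, broadcast over B, H and W
        return (img - self.mean) / self.std

norm = Normalization([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
out = norm(torch.rand(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 3, 8, 8])
```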