Why did we reshape them? Wouldn’t broadcasting take care of it?
My current understanding is that each of the three numbers in the tensors defined above is supposed to normalize a single dimension (one each for height, width, and classes). Is that right?
It might be a bit silly, but I’m having a hard time building an intuition in 3D. Any help would be much appreciated, thanks!
The complete `__init__` method is:

```python
def __init__(self, mean, std):
    # .view the mean and std to make them [C x 1 x 1] so that they can
    # directly work with an image tensor of shape [B x C x H x W].
    # B is batch size, C is number of channels, H is height, W is width.
    self.mean = torch.tensor(mean).view(-1, 1, 1)
    self.std = torch.tensor(std).view(-1, 1, 1)
```
For each image, they take the mean and std per channel so that they can be used for normalization. “Normalizing the dimensions” isn’t quite the right way to think about it: we normalize the data within each channel (you could say we normalize each feature map), and we do so per channel, which is why the tensor gets that shape. Hope that makes sense.
The answer to your question lies in the code where this mean is actually used. There are many ways to write such code; the reshaped tensor is simply the form that works in their case. Keep reading and you will find your answer.
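To see why broadcasting alone doesn’t do the job, here is a minimal sketch (the batch shape `[2, 3, 4, 4]` and the ImageNet-style mean/std values are just illustrative assumptions). Broadcasting aligns trailing dimensions, so a plain shape-`[3]` tensor would line up with W (the last axis), not with the channel axis:

```python
import torch

# Hypothetical image batch: [B, C, H, W] = [2, 3, 4, 4]
imgs = torch.rand(2, 3, 4, 4)

# Example per-channel statistics (ImageNet-style values, for illustration)
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# Without the reshape, broadcasting aligns trailing dims, so the
# shape-[3] tensor is matched against W (size 4 here), which fails.
# Worse, if H and W happened to equal 3, it would silently normalize
# along the wrong axis instead of raising an error.
try:
    _ = (imgs - mean) / std
except RuntimeError as e:
    print("plain [C] tensor fails:", e)

# With .view(-1, 1, 1) the shape becomes [3, 1, 1], which broadcasts
# against [2, 3, 4, 4] as [1, 3, 1, 1]: one (mean, std) pair per channel,
# applied across every pixel of that channel.
out = (imgs - mean.view(-1, 1, 1)) / std.view(-1, 1, 1)
print(out.shape)  # torch.Size([2, 3, 4, 4])
```

So the `.view(-1, 1, 1)` is what pins the three numbers to the channel dimension rather than letting broadcasting guess from the trailing axes.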
You don’t have to be sorry. Anyone can get stuck; always feel free to share your doubts with the community.