I want to standardize images.
Say I have a tensor of shape [20000, 3, 12, 12].
#images is my tensor of said shape
images = (images - images.mean(dim=[1, 2, 3])) / images.std(dim=[1, 2, 3])
RuntimeError: The size of tensor a (12) must match the size of tensor b (20000) at non-singleton dimension 2
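The likely cause of this error: `images.mean(dim=[1, 2, 3])` returns a tensor of shape `[20000]`, and PyTorch broadcasting aligns shapes from the trailing dimension, so `[20000]` is matched against the last axis of `[20000, 3, 12, 12]` and fails. Passing `keepdim=True` keeps the reduced dimensions as size 1, so the statistics broadcast correctly. A minimal sketch, using random data in place of the real tensor:

```python
import torch

# Hypothetical data standing in for the real tensor:
# 20000 RGB images of size 12x12.
images = torch.rand(20000, 3, 12, 12)

# keepdim=True keeps the reduced dims as size 1, so the per-image
# statistics have shape [20000, 1, 1, 1] and broadcast against
# [20000, 3, 12, 12] instead of raising a shape-mismatch error.
mean = images.mean(dim=[1, 2, 3], keepdim=True)
std = images.std(dim=[1, 2, 3], keepdim=True)
images = (images - mean) / std
```

Each image then has (approximately) zero mean and unit standard deviation across all of its channels and pixels.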
Follow-up question:
If I want to standardize each color channel separately - i.e., for each image, compute the mean and std of each channel and standardize that channel with its own statistics - how do I do that?
I tried the built-in normalize transform (TF.normalize) but got another error I couldn't get past -
img = img.to(dtype=torch.float64)
img_mean = img.mean(dim=[1,2])
img_std = img.std(dim=[1,2])
img = TF.normalize(img, mean=[img_mean[0], img_mean[1], img_mean[2]], std=[img_std[0], img_std[1], img_std[2]])
return {'image':img,'target':target}
ValueError: std evaluated to zero after conversion to torch.float64, leading to division by zero.
Most search results advise changing the dtype of the image, but as shown above I already did that and the error still occurs.
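The dtype is probably not the problem here: `TF.normalize` raises that ValueError whenever any std entry is exactly zero, which happens when a channel is constant (e.g. an all-black color plane), regardless of dtype. One way around it, assuming that is the cause, is to skip the transform and do the per-channel arithmetic directly with a small epsilon guarding the division. A sketch with a hypothetical image whose second and third channels are constant:

```python
import torch

# Hypothetical single image standing in for `img` above: one varying
# channel, two constant channels (std == 0, which trips TF.normalize).
img = torch.zeros(3, 12, 12, dtype=torch.float64)
img[0] = torch.linspace(0, 1, 144, dtype=torch.float64).reshape(12, 12)

# Per-channel statistics, kept as [3, 1, 1] so they broadcast over H and W.
mean = img.mean(dim=[1, 2], keepdim=True)
std = img.std(dim=[1, 2], keepdim=True)

eps = 1e-8  # avoids division by zero for constant channels
img = (img - mean) / (std + eps)
```

A constant channel comes out as all zeros (its deviations from the mean are zero), instead of raising. The same idea works on the whole [20000, 3, 12, 12] batch at once by reducing over `dim=[2, 3]` with `keepdim=True`.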