Finding means and stds of a bunch of torch.Tensors (that are converted from ndarray images)

Mona_Jalal · October 15, 2020, 5:02am

to_tensor = transforms.ToTensor()
img = to_tensor(train_dataset[0]['image'])
img

This converts my image values to the range 0 to 1, which is expected. It also converts img from an ndarray to a torch.Tensor.

Previously, without to_tensor (which I now need), the following code snippet worked. I am not sure it is the best way to find the means and stds of the train set, but now it doesn’t work at all. How can I make it work?

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))

print(np.mean(image_arr, axis=(0, 1, 2)))
print(np.std(image_arr, axis=(0, 1, 2)))

The error is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-147-0e007c030629> in <module>
      4     image_arr.append(to_tensor(train_dataset[i]['image']))
      5 
----> 6 print(np.mean(image_arr, axis=(0, 1, 2)))
      7 print(np.std(image_arr, axis=(0, 1, 2)))

<__array_function__ internals> in mean(*args, **kwargs)

~/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py in mean(a, axis, dtype, out, keepdims)
   3333 
   3334     return _methods._mean(a, axis=axis, dtype=dtype,
-> 3335                           out=out, **kwargs)
   3336 
   3337 

~/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
    133 
    134 def _mean(a, axis=None, dtype=None, out=None, keepdims=False):
--> 135     arr = asanyarray(a)
    136 
    137     is_float16_result = False

~/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py in asanyarray(a, dtype, order)
    136 
    137     """
--> 138     return array(a, dtype, copy=False, order=order, subok=True)
    139 
    140 

ValueError: only one element tensors can be converted to Python scalars

fadetoblack · October 15, 2020, 5:28am

image_arr seems to be a python list. Can you tell us the shape of image_arr[0]?

If the shape of each element of image_arr looks like [height,width,channels] then the output of np.mean(image_arr,axis=(0,1,2)) should be an array of three values.

Mona_Jalal · October 15, 2020, 5:29am

Yes, I defined image_arr myself as a list with image_arr = []

Mona_Jalal · October 15, 2020, 5:30am

did you check the error?

ValueError: only one element tensors can be converted to Python scalars

Mona_Jalal · October 15, 2020, 5:43am

Please note that both of the following solutions are wrong, in the sense that they produce a vector of 800 values for the mean, while I expect only 3 values since I have only 3 channels (RGB):

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))
                     
mean = torch.mean(torch.stack(image_arr, dim=0), dim=(0, 1, 2))
std = torch.std(torch.stack(image_arr, dim=0), dim=(0, 1, 2))

mean.shape

and


image_arr = []
for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']).cpu().detach().numpy())

means = np.mean(image_arr, axis=(0, 1, 2))
stds = np.std(image_arr, axis=(0, 1, 2))
means.shape

fadetoblack · October 15, 2020, 5:46am

This should work:

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))

image_arr = [x.unsqueeze(0) for x in image_arr]
image_tens = torch.cat(image_arr, dim=0)

print(image_tens.mean((0,1,2)))

Basically, if you have a list of tensors, each of shape [height, width, channels], you can add one more dimension to each tensor with unsqueeze() so that each has shape [1, height, width, channels]. You can then concatenate all those tensors into one bigger tensor of shape [len(image_arr), height, width, channels]. Once the whole object is a single tensor, it’s easy to calculate the mean.
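The unsqueeze-then-concatenate step can be checked with a small sketch. It assumes the list holds tensors as ToTensor produces them, that is [channels, height, width] rather than [height, width, channels]; the random tensors and their sizes are hypothetical stand-ins for the converted images.

```python
import torch

# Hypothetical stand-ins for the converted images, each [C, H, W]
image_arr = [torch.rand(3, 4, 5) for _ in range(6)]

# unsqueeze(0) returns a new view, so the list elements keep their shape
via_cat = torch.cat([x.unsqueeze(0) for x in image_arr], dim=0)

# torch.stack adds the new leading axis and concatenates in one call
via_stack = torch.stack(image_arr, dim=0)

print(via_cat.shape)                    # [6, 3, 4, 5]
print(torch.equal(via_cat, via_stack))  # both routes build the same tensor
```

Either route gives a batch of shape [len(image_arr), channels, height, width], so which axes to reduce over depends only on where the channel axis ends up.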

Mona_Jalal · October 15, 2020, 5:51am

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))
                     
mean = torch.mean(torch.stack(image_arr, dim=0), dim=(0, 2, 3))
std = torch.std(torch.stack(image_arr, dim=0), dim=(0, 2, 3))

print(mean.shape)
print("mean is {} and std is {}".format(mean, std))

This worked, and the output is:

torch.Size([3])
mean is tensor([0.3809, 0.3810, 0.3810]) and std is tensor([0.1127, 0.1129, 0.1130])
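Stacking the whole train set keeps every image in memory at once. As a possible alternative, here is a minimal sketch that accumulates per-channel sums image by image. The helper name channel_stats and the random test images are hypothetical; it assumes every image is a [channels, height, width] tensor. It computes the population std (like np.std), whereas torch.std applies Bessel's correction by default, so the two can differ slightly.

```python
import torch

def channel_stats(images):
    """Per-channel mean and population std over an iterable of [C, H, W] tensors."""
    total = None
    total_sq = None
    count = 0
    for img in images:
        img = img.double()  # accumulate in float64 to limit rounding error
        if total is None:
            total = torch.zeros(img.shape[0], dtype=torch.float64)
            total_sq = torch.zeros(img.shape[0], dtype=torch.float64)
        total += img.sum(dim=(1, 2))            # sum over height and width
        total_sq += (img ** 2).sum(dim=(1, 2))  # sum of squares, same axes
        count += img.shape[1] * img.shape[2]    # pixels per channel so far
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clamp guards against tiny negative values
    std = (total_sq / count - mean ** 2).clamp(min=0).sqrt()
    return mean.float(), std.float()

torch.manual_seed(0)
images = [torch.rand(3, 8, 10) for _ in range(5)]
mean, std = channel_stats(images)

# Compare against the stacked computation on the same data
stacked = torch.stack(images, dim=0)
print(torch.allclose(mean, stacked.mean(dim=(0, 2, 3)), atol=1e-5))
print(torch.allclose(std, stacked.std(dim=(0, 2, 3), unbiased=False), atol=1e-5))
```

In practice the iterable could be a generator over the dataset, so only one image is held at a time.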

Mona_Jalal · October 15, 2020, 5:53am

Please check my own solution to see why (0, 1, 2) does not work and gives a result with 800 values.

fadetoblack · October 15, 2020, 5:53am

Apologies, it’s image_tens, not image.tens, edited my response.

Also, it depends on the ordering of the dimensions. If each image has shape [channels, height, width], then the stacked tensor has shape [N, channels, height, width], and you should take the mean over axes 0, 2, 3 since axis 1 is channels.

fadetoblack · October 15, 2020, 5:55am

Yes, as I said, maybe your images are of shape [channels, height, width] and not [height, width, channels]. Basically, the dimension holding the channels should be skipped while taking the mean. If the channels are at axis 1 of the stacked tensor, we pass (0,2,3) to the mean function. If the channels are at axis 3, we use (0,1,2) instead. Hope I made sense.
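To make the axis choice concrete, here is a small sketch using a hypothetical random batch with width 800. It shows the same data in channels-first and channels-last layout, and what happens when (0, 1, 2) is applied to a channels-first batch.

```python
import torch

torch.manual_seed(0)
# Hypothetical batch: 4 images, 3 channels, height 6, width 800
batch_nchw = torch.rand(4, 3, 6, 800)
batch_nhwc = batch_nchw.permute(0, 2, 3, 1)  # same data, channels last

mean_nchw = batch_nchw.mean(dim=(0, 2, 3))   # skip axis 1 (channels)
mean_nhwc = batch_nhwc.mean(dim=(0, 1, 2))   # skip axis 3 (channels)
wrong = batch_nchw.mean(dim=(0, 1, 2))       # averages over channels, keeps width

print(mean_nchw.shape)  # [3]
print(mean_nhwc.shape)  # [3]
print(wrong.shape)      # [800]
```

The last line reproduces the 800-value result from earlier in the thread: the reduction ran over the channel axis and left the width axis in place.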