Finding means and stds of a bunch of torch.Tensors (that are converted from ndarray images)

from torchvision import transforms

to_tensor = transforms.ToTensor()
img = to_tensor(train_dataset[0]['image'])
img

This scales my image values to between 0 and 1, which is expected. It also converts img, which is an ndarray, to a torch.Tensor.

Previously, without using to_tensor (which I now need), the following code snippet worked (I am not sure it is the best way to find the means and stds of the train set). However, it no longer works. How can I make it work?

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))

print(np.mean(image_arr, axis=(0, 1, 2)))
print(np.std(image_arr, axis=(0, 1, 2)))

The error is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-147-0e007c030629> in <module>
      4     image_arr.append(to_tensor(train_dataset[i]['image']))
      5 
----> 6 print(np.mean(image_arr, axis=(0, 1, 2)))
      7 print(np.std(image_arr, axis=(0, 1, 2)))

<__array_function__ internals> in mean(*args, **kwargs)

~/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py in mean(a, axis, dtype, out, keepdims)
   3333 
   3334     return _methods._mean(a, axis=axis, dtype=dtype,
-> 3335                           out=out, **kwargs)
   3336 
   3337 

~/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
    133 
    134 def _mean(a, axis=None, dtype=None, out=None, keepdims=False):
--> 135     arr = asanyarray(a)
    136 
    137     is_float16_result = False

~/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py in asanyarray(a, dtype, order)
    136 
    137     """
--> 138     return array(a, dtype, copy=False, order=order, subok=True)
    139 
    140 

ValueError: only one element tensors can be converted to Python scalars

image_arr seems to be a Python list. Can you tell us the shape of image_arr[0]?

If the shape of each element of image_arr looks like [height,width,channels] then the output of np.mean(image_arr,axis=(0,1,2)) should be an array of three values.
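As a minimal sketch of that expectation, assuming plain ndarrays in [height, width, channels] order (the random images below are made-up stand-ins, not the real train_dataset):

```python
import numpy as np

# Hypothetical stand-in data: 4 random RGB images, each [height, width, channels].
rng = np.random.default_rng(0)
images = [rng.random((8, 6, 3)) for _ in range(4)]

# np.mean converts the list into one array of shape [4, 8, 6, 3] and
# reduces over images, height and width, leaving only the channel axis.
means = np.mean(images, axis=(0, 1, 2))
stds = np.std(images, axis=(0, 1, 2))

print(means.shape)  # (3,)
print(stds.shape)   # (3,)
```

This only holds when the channels are the last axis of every element in the list.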

Yes, I defined image_arr myself as a list with image_arr = []

Did you check the error?

ValueError: only one element tensors can be converted to Python scalars

Please note that both of the following solutions are wrong, in the sense that they produce a vector of 800 values for the mean, while I expect only 3 values since I have only 3 channels (RGB).

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))
                     
mean = torch.mean(torch.stack(image_arr, dim=0), dim=(0, 1, 2))
std = torch.std(torch.stack(image_arr, dim=0), dim=(0, 1, 2))

mean.shape

and


image_arr = []
for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']).cpu().detach().numpy())

means = np.mean(image_arr, axis=(0, 1, 2))
stds = np.std(image_arr, axis=(0, 1, 2))
means.shape

This should work:

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))

image_arr = [x.unsqueeze(0) for x in image_arr]
image_tens = torch.cat(image_arr, dim=0)

print(image_tens.mean((0,1,2)))

Basically, if you have a list of tensors, each of shape [height, width, channels], then you can add one more dimension to each tensor using unsqueeze(), to get each tensor in the list to be of shape [1, height, width, channels], then you can concatenate all those tensors to get a bigger tensor of size [len(image_arr), height, width, channels]. Now if the whole object is a tensor, it’s easy to calculate mean.
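A small sketch of the unsqueeze-then-concatenate step described above, using made-up random tensors in place of the real images. It also shows that torch.stack gives the same result in one call, which is an alternative and not something the step requires:

```python
import torch

# Hypothetical stand-in data: 5 tensors that all share the same shape.
image_arr = [torch.rand(4, 6, 3) for _ in range(5)]

# unsqueeze(0) adds a leading dimension of size 1 (out of place, so the
# tensors in image_arr are left unchanged), and cat joins along it.
image_tens = torch.cat([x.unsqueeze(0) for x in image_arr], dim=0)

# torch.stack creates the new leading dimension in a single call.
stacked = torch.stack(image_arr, dim=0)

print(image_tens.shape)                  # torch.Size([5, 4, 6, 3])
print(torch.equal(image_tens, stacked))  # True
```

Either way, all tensors in the list must have identical shapes for the concatenation to succeed.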

image_arr = []

for i in range(len(train_dataset)):
    image_arr.append(to_tensor(train_dataset[i]['image']))
                     
mean = torch.mean(torch.stack(image_arr, dim=0), dim=(0, 2, 3))
std = torch.std(torch.stack(image_arr, dim=0), dim=(0, 2, 3))

print(mean.shape)
print("mean is {} and std is {}".format(mean, std))

This worked, and the output is:

torch.Size([3])
mean is tensor([0.3809, 0.3810, 0.3810]) and std is tensor([0.1127, 0.1129, 0.1130])

Please check my own solution to see why (0, 1, 2) will not work: it gives a result with 800 values.


Apologies, it's image_tens, not image.tens; I have edited my response.

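Also, it depends on the ordering of the dimensions. If each image has shape [channels, height, width] (which is what ToTensor returns), the stacked tensor has shape [N, channels, height, width], so you should take the mean over axes 0, 2, 3, since axis 1 is the channels.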

Yes, as I said, maybe your images are of shape [channels, height, width] and not [height, width, channels]. Basically, the dimension holding the channels should be skipped when taking the mean. After stacking, if the channels are in dimension 1 (right after the batch dimension), pass (0, 2, 3) to the mean function. If the channels are in the last dimension, use (0, 1, 2) instead. Hope that makes sense.
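To make the rule concrete, here is a hedged sketch with a made-up batch of 32x32 images (the sizes are assumptions for illustration). With channels-first data, reducing over (0, 1, 2) leaves the width dimension, which is presumably how the 800 values arose if the real images are 800 pixels wide:

```python
import torch

# Hypothetical batch: 10 RGB images of 32x32 in channels-first order,
# i.e. [N, channels, height, width], as obtained by stacking ToTensor outputs.
batch_chw = torch.rand(10, 3, 32, 32)

# Reducing over (0, 2, 3) skips dimension 1, so one value per channel remains.
mean_chw = batch_chw.mean(dim=(0, 2, 3))
std_chw = batch_chw.std(dim=(0, 2, 3))
print(mean_chw.shape)  # torch.Size([3])

# Reducing over (0, 1, 2) on the same tensor leaves the width dimension
# instead, giving one value per image column (32 here).
wrong = batch_chw.mean(dim=(0, 1, 2))
print(wrong.shape)  # torch.Size([32])

# With channels last, [N, height, width, channels], (0, 1, 2) is correct.
batch_hwc = batch_chw.permute(0, 2, 3, 1)
mean_hwc = batch_hwc.mean(dim=(0, 1, 2))
print(mean_hwc.shape)  # torch.Size([3])
```

Printing the shape of the stacked tensor before reducing is a quick way to see which index holds the channels.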