PyTorch representation of a grayscale image

I have data with a single channel, so it is a numpy array of shape (250, 164). Since I want to use it as a grayscale image, what would be the difference between:
Method 1: add a dimension to the numpy array

         data_numpy = np.expand_dims(data_numpy, axis=2)

and then, in the training loop, permute the dimensions to get the proper shape:

        data = data.permute(0, 3, 1, 2)

Method 2: add the dimension to the torch tensor before training:

        tensor_data = tensor_data.unsqueeze(dim=0)

Both approaches should yield the same results.
The difference is that e.g. tensor_data.unsqueeze(0) would be tracked by Autograd, but I assume that's not your use case.
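
For completeness, here is a minimal sketch of both methods side by side (assuming dummy data of shape (250, 164) and a batch dimension added via unsqueeze, e.g. as a DataLoader with batch size 1 would produce); both end up as the same NCHW tensor:

import numpy as np
import torch

data_numpy = np.random.rand(250, 164).astype(np.float32)

# Method 1: add a channel dim in numpy, then permute NHWC -> NCHW in torch
m1 = np.expand_dims(data_numpy, axis=2)                      # (250, 164, 1)
m1 = torch.from_numpy(m1).unsqueeze(0)                       # (1, 250, 164, 1)
m1 = m1.permute(0, 3, 1, 2)                                  # (1, 1, 250, 164)

# Method 2: convert first, then unsqueeze channel and batch dims in torch
m2 = torch.from_numpy(data_numpy).unsqueeze(0).unsqueeze(0)  # (1, 1, 250, 164)

print(m1.shape, m2.shape)   # both torch.Size([1, 1, 250, 164])
print(torch.equal(m1, m2))  # True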


Could you please give more details on what "tracked by Autograd" means? It might help.

Here is a small example showing that unsqueeze is differentiable and won't break the computation graph:

import torch

x = torch.randn(1, 1, requires_grad=True)
y = x.unsqueeze(0)        # unsqueeze is recorded in the autograd graph
y.mean().backward()       # gradients flow back through the unsqueeze to x
print(x.grad)
# > tensor([[1.]])

However, this boils down to the differences between PyTorch and numpy.
Assuming you only care about creating a tensor in the right shape, both approaches would be fine.
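
To make the numpy/PyTorch difference concrete, here is a small sketch (the tensor is just a placeholder): torch ops such as unsqueeze stay in the autograd graph, while going through numpy requires detaching first, which cuts the graph:

import numpy as np
import torch

x = torch.randn(250, 164, requires_grad=True)

# torch path: unsqueeze is tracked, so gradients flow back to x
y = x.unsqueeze(0).unsqueeze(0)                  # (1, 1, 250, 164)
y.mean().backward()
print(x.grad.shape)                              # torch.Size([250, 164])

# numpy path: .numpy() requires a detached tensor, which cuts the graph
z = np.expand_dims(x.detach().numpy(), axis=2)   # (250, 164, 1)
print(type(z))                                   # <class 'numpy.ndarray'>, not tracked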
