Reshaping tensor image

Ghostv · June 3, 2024, 11:23pm

Hi,

This is probably very basic, but I have the following issue: I have an RGB image read as a tensor (torchvision.io.read_image) with the shape (c, rows, columns). Now I’m trying to reshape it into a (rows x columns, 3) array in which each row contains the 3 channel values (the RGB color) of one pixel, but I haven’t been able to figure out a way to tell reshape() to read the values iterating over the channel dimension first. When I use img.reshape((4140*2647,3)), the values are read row-wise.

Any idea how to achieve it?

Thanks in advance!

vdw · June 4, 2024, 8:06am

Since you want to swap the dimensions, reshape() and view() are probably the wrong approaches anyway. Assuming your tensor is called X, you can do

X = X.permute(1, 2, 0)

Ghostv · June 4, 2024, 10:36am

I never ever said I wanted to ‘swap the dimensions’; I literally said I wanted to reshape.

I managed to do it with

y = x.flatten(start_dim=1,end_dim=2).t().reshape(x.shape[1]*x.shape[2],3)
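For readers who want to see what each step of that one-liner does, here is a minimal sketch on a made-up 3 x 2 x 4 tensor (the tensor and the names flat and pixels are illustrative, not from the original post). It suggests that the transpose already yields one row per pixel, so the trailing reshape does not reorder anything.

```python
import torch

# Hypothetical tiny "image": 3 channels, 2 rows, 4 columns.
x = torch.arange(3 * 2 * 4).reshape(3, 2, 4)

# Step 1: merge rows and columns -> (3, 8), one row per channel.
flat = x.flatten(start_dim=1, end_dim=2)

# Step 2: transpose -> (8, 3), one row per pixel.
pixels = flat.t()

# Step 3: the reshape from the post; the shape is already (8, 3).
y = pixels.reshape(x.shape[1] * x.shape[2], 3)

print(flat.shape, pixels.shape, y.shape)
print(y[0].tolist())  # channel values of pixel (0, 0): [0, 8, 16]
```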

Thanks anyway!

vdw · June 7, 2024, 1:05am

I still think you might(!) need to swap the dimensions first, since your c dimension is first in the input and last in the output. Sure, you can reshape it immediately, but you might mess up your data semantically (see here).
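A small sketch of what "messing up the data semantically" can look like, using a made-up 2 x 2 image whose values encode the channel in the tens digit (the tensor and the names naive and fixed are assumptions for illustration):

```python
import torch

# Hypothetical 2x2 image, shape (3, 2, 2): 1x = R, 2x = G, 3x = B.
img = torch.tensor([
    [[10, 11], [12, 13]],  # R
    [[20, 21], [22, 23]],  # G
    [[30, 31], [32, 33]],  # B
])

# Reshaping directly walks the memory row-wise, so each output row
# mixes neighbouring pixels of the same channel.
naive = img.reshape(-1, 3)

# Moving the channel dimension last first gives one pixel per row.
fixed = img.permute(1, 2, 0).reshape(-1, 3)

print(naive[0].tolist())  # [10, 11, 12]
print(fixed[0].tolist())  # [10, 20, 30]
```

Both results have shape (4, 3), which is why the direct reshape raises no error even though its rows are not RGB triples.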

I don’t doubt that your solution is perfectly fine; I honestly can’t tell from just looking at the line. I’m just wondering if something like this

X = X.permute(1, 2, 0)
X = X.reshape(-1, 3)  # not 100% sure if correct

might not be much easier to read/maintain.

Ghostv · June 11, 2024, 2:02am

Just tested it, both solutions are indeed equivalent :).
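For anyone who wants to repeat the check, a minimal sketch along these lines should work. It uses a random uint8 tensor as a stand-in for the output of torchvision.io.read_image; the size is arbitrary, not the original image's.

```python
import torch

# Stand-in for a (C, H, W) uint8 image tensor; the size is arbitrary.
x = torch.randint(0, 256, (3, 5, 7), dtype=torch.uint8)

# Solution 1: flatten the spatial dims, then transpose.
a = x.flatten(start_dim=1, end_dim=2).t().reshape(x.shape[1] * x.shape[2], 3)

# Solution 2: move channels last, then reshape.
b = x.permute(1, 2, 0).reshape(-1, 3)

print(torch.equal(a, b))  # True
```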

Thanks again!