Convert image tensor to numpy image array

Hi, let’s say I have an image tensor (not a minibatch), so its dimensions are (3, X, Y).
I want to convert it to NumPy in order to apply an OpenCV manipulation to it (writing text on it).
Calling .numpy() works fine, but then how do I rearrange the dimensions to match the numpy convention (X, Y, 3)?
I guess I could use img.transpose(0, 1).transpose(1, 2), but I’m just wondering if there’s a smoother bridge to numpy.


I have an implementation below that can both flatten and unflatten the net itself. Hopefully you can use some of it.

import numpy as np

#############################################################################
# Flattening the NET
#############################################################################
def flattenNetwork(net):
    flatNet = []
    shapes = []
    for param in net.parameters():
        arr = param.detach().cpu().numpy()
        # record the original shape so the net can be rebuilt later
        shapes.append(arr.shape)
        # flatten to 1-D; works for biases (1-D), linear weights (2-D),
        # and conv weights (4-D) alike
        flatNet.append(arr.reshape(-1))
    # concatenate everything into a single 1-D array
    finalNet = np.concatenate(flatNet)
    return finalNet, shapes


#############################################################################
# UN-Flattening the NET
#############################################################################
def unFlattenNetwork(weights, shapes):
    # this is how we know how to slice weights
    begin_slice = 0
    end_slice = 0
    finalParams = []
    for shape in shapes:
        # each parameter consumes prod(shape) values of the flat vector
        end_slice = end_slice + int(np.prod(shape))
        curr_slice = weights[begin_slice:end_slice]
        # restore the shape recorded by flattenNetwork
        param = np.array(curr_slice).reshape(shape)
        finalParams.append(param)
        begin_slice = end_slice
    # the shapes differ, so return a list of arrays rather than forcing
    # them into a single (object-dtype) ndarray
    return finalParams
Usage:

flat_weights, shapes = flattenNetwork(model)
params = unFlattenNetwork(flat_weights, shapes)

unFlattenNetwork gives you back the NumPy n-dimensional arrays, which you can assign directly to a variable.
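If you also want to load the unflattened arrays back into the net, a minimal sketch could look like this (loadUnflattened is a hypothetical helper, assuming params is in net.parameters() order):

import torch

def loadUnflattened(net, params):
    # copy each unflattened array back into the matching parameter
    with torch.no_grad():
        for p, arr in zip(net.parameters(), params):
            p.copy_(torch.from_numpy(arr))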

You have to permute the axes at some point. Usually I do: x.permute(1, 2, 0).numpy() to get the numpy array.
If I recall correctly, np.transpose should also take multiple axis indices.

As an alternative, you could use a transform from torchvision, e.g. torchvision.transforms.ToPILImage()(x) and maybe use a PIL function to draw on your image.
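For the original use case (drawing text with OpenCV), here is a minimal sketch, assuming a float CHW tensor in [0, 1]; the text, position, and color are placeholder values:

import cv2
import numpy as np
import torch

x = torch.rand(3, 224, 224)  # example CHW float image in [0, 1]

# CHW float -> HWC uint8; OpenCV expects a contiguous HWC array
img = np.ascontiguousarray((x.permute(1, 2, 0).numpy() * 255).astype(np.uint8))

# draw placeholder text at a placeholder position
cv2.putText(img, 'hello', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), 2)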


Some excellent ideas here, thanks

How to reverse the action of .unsqueeze(0)
I have tried to use

torchvision.transforms.ToPILImage()(x)

to convert a tensor to a PIL image.
With a 3-dimensional tensor it works correctly, giving the right output:

torch.Size([3, 224, 224])
(224, 224, 3)

but a 4-dimensional tensor gives an error:

torch.Size([1, 3, 224, 224])
ValueError: pic should be 2/3 dimensional. Got 4 dimensions.

How can I reverse tensor [1, 3, 224, 224] to [3, 224, 224]?
Thank you for helping :slight_smile:

tensor = tensor.squeeze(0) should work.
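For example, squeeze(0) exactly undoes unsqueeze(0):

x = torch.rand(3, 224, 224)
batch = x.unsqueeze(0)    # torch.Size([1, 3, 224, 224])
img = batch.squeeze(0)    # torch.Size([3, 224, 224])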


Oh thank you, I hadn’t heard of squeeze() before. Now squeeze() is working. :slight_smile:

If x is a batched tensor image, you can simply use x[0], which will give you [3, 224, 224].

Alternatively, you can do the swap on the NumPy side with np.swapaxes. If you have a tensor image ten of shape [3, 32, 32], then:

img = ten.numpy()
img = np.swapaxes(img, 0, 1)
img = np.swapaxes(img, 1, 2)

will convert it to a numpy image img of shape [32, 32, 3].
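Both swaps together are equivalent to a single transpose with an axes tuple:

img = np.transpose(ten.numpy(), (1, 2, 0))  # also [32, 32, 3]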

Very, very useful tips!
I love your comment! :slightly_smiling_face:
Thx!!


Usually, tensor images are float with values between 0 and 1, but NumPy images are uint8 with pixels between 0 and 255. So you need an extra step:

np.array(x.permute(1, 2, 0) * 255, dtype=np.uint8)
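A slightly safer variant of the same conversion clamps first, in case float values drift outside [0, 1]:

img = (x.permute(1, 2, 0).clamp(0, 1) * 255).byte().numpy()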