PyTorch transforms give me weird results

tfms = transforms.Compose([
    #transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

I’m trying to resize my image to 224,224 and then centre crop it. Here is the original image and the image after the transform. Why are there 9 images after the transform is applied?

Image after the transform:
[Screenshot 2020-02-11 at 8.06.02 PM]

Image before the transform

Could you print the shape and type of the image?
The result looks interleaved, which would happen if you, e.g., used view instead of permute to swap the axes.
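
As a small illustration of the difference (a sketch on a made-up 2×3 tensor, not the poster's data): `view` keeps the elements in their flattened order and only changes how they are grouped, while `permute` actually swaps the axes.

```python
import torch

# Toy tensor standing in for an image; the values are illustrative only.
x = torch.arange(6).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]

viewed = x.view(3, 2)            # same flattened order, regrouped into rows of 2
permuted = x.permute(1, 0)       # axes swapped, i.e. the transpose

print(viewed.tolist())    # [[0, 1], [2, 3], [4, 5]]
print(permuted.tolist())  # [[0, 3], [1, 4], [2, 5]]
```

Both results have shape (3, 2), so checking the shape alone would not reveal the problem.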

This is the code:

tfms = transforms.Compose([
    #transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
object_img_tensor = tfms(PIL.Image.fromarray(object_image))
context_img_tensor = tfms(PIL.Image.fromarray(context_image))

Type of the image before the transform is applied: <class 'numpy.ndarray'>
Shape of the image before the transform is applied: (480, 640, 3)
Type of the image after the transform is applied: <class 'torch.Tensor'>
Shape of the image after the transform is applied: torch.Size([3, 224, 224])

The shapes look alright.
Could you post the code you are using to visualize the image?


The data was interleaved by the ‘reshape’ function. To reorder the dimensions, e.g. from [3, 224, 224] to [224, 224, 3] for display, use the ‘permute’ function instead of ‘reshape’.
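
A minimal sketch of that fix, assuming the tensor leaves the transform in channels-first layout [3, 224, 224] and the plotting code expects channels-last [224, 224, 3]. The name `img_tensor` and the random data are placeholders, not the original poster's code.

```python
import torch

torch.manual_seed(0)

# Placeholder for the transformed image: channels-first (C, H, W).
img_tensor = torch.rand(3, 224, 224)

# permute reorders the axes, so pixel (h, w) keeps its three channel values.
img_hwc = img_tensor.permute(1, 2, 0)          # shape (224, 224, 3)

# reshape yields the same shape but regroups the flattened data instead.
img_wrong = img_tensor.reshape(224, 224, 3)

print(tuple(img_hwc.shape), tuple(img_wrong.shape))
# Pixel (5, 7) of the permuted image holds the three channel values of (5, 7).
print(torch.equal(img_hwc[5, 7], img_tensor[:, 5, 7]))
# The reshaped version has the same shape but different contents.
print(torch.equal(img_hwc, img_wrong))
```

Calling `.numpy()` on `img_hwc` gives an array in the same (H, W, C) layout, which is the layout that plotting functions such as matplotlib's `imshow` expect for colour images.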


Ok this works. I’m just curious as to how the reshape function interleaves the data?

Let X be an input array of dimension [3][224][224]. When X is reshaped, its elements are first read in flattened (row-major) order: X[0][0][0], X[0][0][1], …, X[0][0][223], X[0][1][0], …, X[0][223][223], X[1][0][0], …, X[2][223][223]. These elements are then taken in consecutive groups of 3 to form an output array Y of dimension [224][224][3]. Hence the output pixels are Y[0][0][0] = X[0][0][0], Y[0][0][1] = X[0][0][1], Y[0][0][2] = X[0][0][2], Y[0][1][0] = X[0][0][3], …, Y[223][223][2] = X[2][223][223]. Each output pixel therefore receives three consecutive values of the flattened array, which almost always come from a single channel, rather than one value from each channel.
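
The mapping can be checked on a tensor whose values are their own flat indices. This is only a sketch: `X` is filled with `torch.arange` rather than real image data, and `source_index` is a helper written for this illustration, not a PyTorch function.

```python
import torch

C, H, W = 3, 224, 224

# Each element of X stores its own position in the flattened order.
X = torch.arange(C * H * W).reshape(C, H, W)

# reshape keeps the flattened order and regroups it into (H, W, C).
Y = X.reshape(H, W, C)

def source_index(i, j, k):
    """Return (c, h, w) such that Y[i][j][k] == X[c][h][w] (illustrative helper)."""
    flat = (i * W + j) * C + k        # position of Y[i][j][k] in flattened order
    c, rest = divmod(flat, H * W)     # channel, then offset inside that channel
    h, w = divmod(rest, W)            # row and column inside the channel
    return c, h, w

print(source_index(0, 1, 0))          # (0, 0, 3): Y[0][1][0] comes from X[0][0][3]
print(source_index(223, 223, 2))      # (2, 223, 223)
print(Y[0, 1, 0].item() == X[0, 0, 3].item())   # True
```

The same arithmetic suggests why a 3×3 grid appears. Each row of Y consumes 3 × 224 consecutive values, i.e. three rows of one channel of X, and each colour plane of Y samples every third value, so a row of the output holds three narrowed copies side by side. Stepping through all 224 rows of Y walks through the three channels in turn, giving three copies stacked vertically. That is consistent with the 9 tiles in the question.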