I’m trying to resize my image to 224×224 and then centre-crop it. Here are the original image and the image after the transform. Why are there 9 images after the transform is applied?
Could you print the shape and type of the image?
The result looks interleaved, which would happen if you e.g. used view instead of permute to swap the axes.
Type of image before transform is applied: <class 'numpy.ndarray'>
Shape of the image before transform is applied: (480, 640, 3)
Type of image after transform is applied: <class 'torch.Tensor'>
Shape of the image after transform is applied: torch.Size([3, 224, 224])
Let X be an input array of dimension [3][224][224]. When X is reshaped, it is first flattened as X[0][0][0], X[0][0][1], …, X[0][0][223], X[0][1][0], …, X[0][223][223], X[1][0][0], …, X[2][223][223]. The flattened elements are then taken in groups of 3 to form an output array Y of dimension [224][224][3]. Hence the output pixels are Y[0][0][0] = X[0][0][0], Y[0][0][1] = X[0][0][1], Y[0][0][2] = X[0][0][2], Y[0][1][0] = X[0][0][3], …, Y[223][223][2] = X[2][223][223]. In other words, each output "pixel" holds 3 consecutive values from a single channel rather than one value from each of the 3 channels, which produces the tiled/interleaved 9-image effect.
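A minimal sketch of the difference, using a tiny 3×2×2 array and NumPy's reshape/transpose (the exact analogues of torch's view/permute for this flattening argument):

```python
import numpy as np

# A tiny CHW "image": 3 channels of shape (2, 2), standing in for the
# (3, 224, 224) tensor above. Flat values 0..11 make the ordering visible.
chw = np.arange(12).reshape(3, 2, 2)

# Correct: transpose (torch's permute) moves axes, so each output pixel
# gathers one value from each of the 3 channels.
hwc_ok = chw.transpose(1, 2, 0)

# Wrong: reshape (torch's view) just re-groups the flattened buffer,
# so each output "pixel" holds 3 consecutive values from one channel.
hwc_bad = chw.reshape(2, 2, 3)

print(hwc_ok[0, 0])   # [0 4 8]  -> pixel (0, 0) across the 3 channels
print(hwc_bad[0, 0])  # [0 1 2]  -> first 3 flattened elements, all channel 0
```

With real image data the reshaped version shows exactly the interleaved grid-of-copies artifact described above, while the transposed version displays correctly.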