ResNet Input Dimensions

Does it make a difference whether I input images to ResNet as (C, H, W) or (H, W, C)? Matplotlib's imshow needs (H, W, C), so that is the format I am currently using. However, I introduced transforms to perform data augmentation, and those return images in (C, H, W).

When using pretrained PyTorch models, which layout is best? Thanks!

Hi,

PyTorch only works on the CHW format (NCHW once the batch dimension is added), although there is some experimental support for HWC. For matplotlib's imshow, you can simply permute the tensor's dimensions to HWC and, if needed, convert it to NumPy with tensor.numpy().
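As a side note, I believe the experimental HWC support refers to PyTorch's channels_last memory format; here is a minimal sketch (not from the original answer, just an illustration) showing that it only changes the memory layout, not the logical shape:

import torch

# channels_last sketch: the tensor still reports shape (N, C, H, W),
# only the underlying strides are reordered so channels are contiguous.
x = torch.randn(1, 3, 224, 224)
x_cl = x.to(memory_format=torch.channels_last)

print(x_cl.shape)       # torch.Size([1, 3, 224, 224]) -- unchanged
print(x_cl.stride())    # strides reordered for HWC-style memory layout
print(x_cl.is_contiguous(memory_format=torch.channels_last))  # True

For display with matplotlib, though, you still permute the tensor back to HWC as described above.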

For instance, if you are using PIL:

from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

x = Image.open('cat.jpg')                          # PIL image (HWC when viewed as an array)
transform = transforms.Compose([transforms.Resize([256, 256]),
                                transforms.RandomCrop(224),
                                transforms.ToTensor()])   # returns a CHW tensor in [0, 1]
new_x = transform(x)
plt.imshow(new_x.permute(1, 2, 0))                 # back to HWC for display
plt.show()

Since new_x is a tensor, you can also convert it explicitly with new_x.permute(1, 2, 0).numpy() before passing it to imshow.
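As for the pretrained-model part of the question: keep the tensor in CHW and just add a batch dimension before the forward pass. A rough sketch (assuming x is the PIL image from above; the mean/std values are the standard ImageNet statistics used by torchvision's pretrained models):

import torch
from torchvision import models, transforms

# Preprocessing expected by torchvision's pretrained ResNets:
# CHW float tensor, normalized with ImageNet statistics.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet18(pretrained=True)   # older torchvision API; newer versions use weights=...
model.eval()

batch = preprocess(x).unsqueeze(0)         # CHW -> NCHW by adding a batch dimension
with torch.no_grad():
    logits = model(batch)                  # shape (1, 1000)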

Best
