Channel ordering [B, H, W, C] or [B, W, H, C]

PyTorch uses channels-first ([B, C, H, W]) by default. You can convert the input as well as the model parameters to the channels-last memory format as described here, which can be beneficial for mixed-precision training on Tensor Cores.
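A minimal sketch of that conversion, assuming a toy `nn.Conv2d` layer and a random batch (both are placeholders, not taken from your code). Note that the channels-last memory format only changes how the data is laid out in memory (the strides); the logical shape stays [B, C, H, W].

```python
import torch
import torch.nn as nn

# Placeholder batch in PyTorch's default channels-first layout: [B, C, H, W].
x = torch.randn(8, 3, 32, 32)

# Placeholder model; any module with 4D parameters is handled the same way.
model = nn.Conv2d(3, 16, kernel_size=3, padding=1)

# Convert the input and the model parameters to the channels-last memory format.
x_cl = x.to(memory_format=torch.channels_last)
model = model.to(memory_format=torch.channels_last)

out = model(x_cl)

# The logical shape is unchanged; only the strides differ.
print(x_cl.shape)
print(x.stride(), x_cl.stride())
print(x_cl.is_contiguous(memory_format=torch.channels_last))
```

Whether this actually speeds up training depends on your GPU, the layers in the model, and the dtype, so it is worth profiling before and after.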

You can visualize the images with any library that can display them (e.g. PIL or matplotlib).
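Most plotting libraries expect channels-last image arrays ([H, W, C]), so a channels-first tensor usually has to be permuted first. A rough sketch, assuming float images with values in [0, 1]; `to_hwc_uint8` is a hypothetical helper name, and the random batch is a placeholder:

```python
import torch

def to_hwc_uint8(img):
    """Convert one [C, H, W] float image (assumed to be in [0, 1])
    to an [H, W, C] uint8 NumPy array."""
    img = img.detach().cpu().clamp(0, 1)
    img = img.permute(1, 2, 0)  # [C, H, W] -> [H, W, C]
    return (img * 255).round().to(torch.uint8).numpy()

# Placeholder batch in channels-first layout: [B, C, H, W].
batch = torch.rand(4, 3, 32, 32)
arr = to_hwc_uint8(batch[0])
print(arr.shape, arr.dtype)

# The array can then be handed to a display library, e.g.:
#   PIL.Image.fromarray(arr).show()
#   matplotlib.pyplot.imshow(arr)
```

If your images were normalized (e.g. with a mean and standard deviation), undo that normalization before the conversion, otherwise the colors will look off.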