PyTorch uses the channels-first (NCHW) memory layout by default. You can convert both the input and the model parameters to channels-last (NHWC), as described in the PyTorch documentation on memory formats, which can be beneficial for mixed-precision training on Tensor Cores.
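
As a minimal sketch of that conversion: the batch shape and the single `Conv2d` layer below are assumptions made for illustration and do not come from the original pipeline. The logical shape stays `(N, C, H, W)`; only the strides in memory change.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 8 RGB images of 32x32 in the default channels-first layout.
x = torch.randn(8, 3, 32, 32)
# Hypothetical stand-in for a real model.
model = nn.Conv2d(3, 16, kernel_size=3, padding=1)

# Convert the input and the model parameters to channels-last.
x_cl = x.to(memory_format=torch.channels_last)
model = model.to(memory_format=torch.channels_last)

out = model(x_cl)

print(x_cl.shape)     # the logical shape is unchanged
print(x_cl.stride())  # the channel dimension now has stride 1
print(x_cl.is_contiguous(memory_format=torch.channels_last))
```

Whether this speeds up training depends on the hardware, the model, and whether mixed precision is enabled, so it is worth benchmarking both layouts.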
You can visualize the images with any library that can display them (e.g. PIL or matplotlib).
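
For instance, a channels-first tensor can be turned into a PIL image roughly as follows. This is only a sketch: it assumes a single `(C, H, W)` float image with values in `[0, 1]`, and the random tensor is a placeholder for a real sample.

```python
import torch
from PIL import Image

# Placeholder image: channels-first (C, H, W), floats in [0, 1].
image = torch.rand(3, 32, 48)

# PIL expects (H, W, C) with uint8 values, so reorder the axes and rescale.
array = (image.permute(1, 2, 0) * 255).to(torch.uint8).contiguous().numpy()
img = Image.fromarray(array)

print(img.size, img.mode)  # PIL reports size as (width, height)
```

Calling `img.show()` opens the image in the default viewer, and the same `(H, W, C)` array can be passed to matplotlib's `plt.imshow`.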