As far as I understand, PyTorch's native AMP performs part of the model training in half precision. Does it also make sense to force the dataset (e.g. ImageNet) into half precision for additional speed-ups, or does that already happen automatically when using AMP?
autocast will cast data to a lower-precision dtype (float16 or bfloat16, depending on the setup used) for eligible operations for you. There is no need to manually transform the data or any other input to this dtype.
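To illustrate, here is a minimal sketch that feeds a float32 input through a layer under autocast and checks the dtypes. The tiny `nn.Linear` model and the random input are placeholders I made up for the example, and it uses CPU autocast with bfloat16 only so that it runs without a GPU. On a CUDA setup you would typically pass `device_type="cuda"` instead.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for a real model and a real data batch.
model = nn.Linear(8, 4)   # parameters are stored in float32
x = torch.randn(2, 8)     # input stays in float32, as a DataLoader would yield it

# CPU + bfloat16 is chosen here only to keep the example runnable anywhere.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)        # linear is an eligible op, so autocast casts internally

print(x.dtype)             # the input tensor itself is not modified
print(out.dtype)           # the output of the eligible op is in the lower precision
print(model.weight.dtype)  # the stored parameters remain float32
```

The input and the parameters keep their original dtype. Only the computation inside the autocast region runs in the lower precision, which is why casting the dataset yourself is not needed.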