CPU workloads should support bfloat16 in autocast, as described in the docs. As shown in the CPU example section of `torch.autocast`, "automatic mixed precision training/inference" on CPU with a datatype of `torch.bfloat16` only uses `torch.autocast`.
I don't know what the status of float16 support on CPU is, or whether it's planned.
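For reference, a minimal sketch of CPU autocast with bfloat16 (the model and shapes are arbitrary examples; note that no `GradScaler` is needed on CPU, unlike the CUDA fp16 path):

```python
import torch

# Arbitrary example model and input
model = torch.nn.Linear(8, 4)
x = torch.randn(2, 8)

# Enter an autocast region on CPU; eligible ops (e.g. nn.Linear)
# run in bfloat16 inside this context
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)  # torch.bfloat16 for the linear layer's output
```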