[CPU] Train network using float16?

CPU workloads should support bfloat16 in autocast, as described in the docs:

As shown in the CPU example section of the torch.autocast docs, "automatic mixed precision training/inference" on CPU with a datatype of torch.bfloat16 uses only torch.autocast; unlike float16 on the GPU, no gradient scaling is needed, since bfloat16 keeps float32's exponent range.
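
For reference, a minimal sketch of what bfloat16 autocast training on CPU looks like (the model, data, and hyperparameters here are placeholders):

```python
import torch
import torch.nn as nn

# Toy model and data, just to illustrate the autocast usage.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 16)
y = torch.randint(0, 2, (8,))

for _ in range(3):
    optimizer.zero_grad()
    # On CPU, wrapping the forward pass in torch.autocast with
    # torch.bfloat16 is all that's required; no GradScaler is used.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        out = model(x)
        loss = loss_fn(out, y)
    # Run the backward pass outside the autocast region, as recommended.
    loss.backward()
    optimizer.step()
```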

I don’t know what the status of float16 support on CPU is, or whether it’s planned.
