How do I know whether Tensor Cores are used in PyTorch (for FP16, bfloat16, INT8)?

From the PyTorch documentation it is very hard to know whether a model is using Tensor Cores or not (for FP16, bfloat16, INT8).

What I know so far:

  • FP32 will not run on Tensor Cores, since it is not supported
  • Enabling TF32 in PyTorch will run your FP32 matmuls and convolutions in TF32 on Tensor Cores (see the sketch after this list)
  • Running (Automatic) Mixed Precision will dispatch eligible ops to Tensor Cores
  • Converting a model to FP16 or bfloat16: it is unclear whether it is/will be using Tensor Cores or not. According to the PyTorch forums:

PyTorch uses Tensor Cores on Volta GPUs as long as your inputs are in FP16 and the dimensions of your GEMMs/convolutions satisfy the conditions for using Tensor Cores (basically, GEMM dimensions are multiples of 8, or, for convolutions, the batch size and the number of input and output channels are multiples of 8). For Ampere and newer, FP16 and BF16 should use Tensor Cores for common ops, and FP32 should use them for convolutions (via TF32).

So how do I know that CUDA cores are not being used instead?

  • What about INT8?
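To make the first three bullets concrete, here is a minimal sketch (assuming a CUDA-capable GPU and a recent PyTorch) of the three opt-in paths mentioned above. The shapes follow the multiple-of-8 rule from the forum quote; whether a given kernel actually lands on Tensor Cores still depends on the GPU generation and the PyTorch/cuDNN version:

```python
import torch

# 1. TF32: lets FP32 matmuls/convolutions use Tensor Cores (Ampere and newer).
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# 2. Automatic Mixed Precision: eligible ops are autocast to FP16.
x = torch.randn(256, 512, device="cuda")
w = torch.randn(512, 1024, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = x @ w  # runs as an FP16 GEMM, eligible for Tensor Cores

# 3. Explicit FP16 inputs with Tensor-Core-friendly shapes
#    (every GEMM dimension here is a multiple of 8).
a = torch.randn(256, 512, device="cuda", dtype=torch.float16)
b = torch.randn(512, 1024, device="cuda", dtype=torch.float16)
c = a @ b
```

Note that none of this proves Tensor Cores were used; it only makes the kernels eligible, which is exactly why profiling (see below) is the reliable check.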

This blog post from NVIDIA explains how to use Tensor Cores. It should partially answer your question as to when Tensor Cores will be used:

If you want to ensure that Tensor Cores were actually used, you can profile your code, for example with NVPROF, as mentioned here:
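As a PyTorch-level alternative to nvprof, here is a sketch using torch.profiler to list the CUDA kernels that actually ran. On many PyTorch builds, Tensor Core GEMM kernels carry substrings such as "884", "1688", or "16816" (the HMMA tile shapes) in their names, but that naming is an implementation detail that varies across versions, so treat it as a heuristic:

```python
import torch
from torch.profiler import profile, ProfilerActivity

a = torch.randn(256, 512, device="cuda", dtype=torch.float16)
b = torch.randn(512, 1024, device="cuda", dtype=torch.float16)

with profile(activities=[ProfilerActivity.CUDA]) as prof:
    c = a @ b
    torch.cuda.synchronize()

# Print the names of the CUDA kernels that executed; names containing
# "884"/"1688"/"16816" are typically HMMA (Tensor Core) GEMM kernels.
for evt in prof.key_averages():
    print(evt.key)
```

With nvprof itself, the tensor_precision_fu_utilization metric reports Tensor Core utilization directly on Volta-class GPUs; on newer architectures, Nsight Compute replaces nvprof for this kind of measurement.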