Why is TensorBoard reporting no Tensor Cores?

I have this simple code:

import torch
import torchvision.models as models
from torch.profiler import profile, record_function, ProfilerActivity


#https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html

if __name__ == '__main__':

    #https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html
    torch.set_float32_matmul_precision("high")    
    #same as
    #torch.backends.cuda.matmul.allow_tf32 = True

    model = models.resnet50().cuda()
    model.eval()
    inputs = torch.randn(5, 3, 224, 224).cuda()
   
    traceHandler = torch.profiler.tensorboard_trace_handler('trace_test')

    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
                 record_shapes=True,
                 profile_memory=True,
                 with_stack=True,
                 on_trace_ready=traceHandler,
                 ) as prof:
        with record_function("model_inference"):
            with torch.cuda.amp.autocast():
                model(inputs)

When I open the generated trace in TensorBoard, I can see:

Why are Tensor Cores not used? Is this a problem with the TensorBoard results or with the profiler module?

I have an RTX 3080 Ti, torch 1.13.1, torch-tb-profiler 0.4.0, and CUDA 11.7. Python is 3.8.13.

I’m not 100% sure that what TensorBoard reports is accurate here. Could you check whether toggling torch.backends.cudnn.allow_tf32 (https://pytorch.org/docs/stable/backends.html#torch.backends.cudnn.torch.backends.cudnn.allow_tf32) changes the performance of your workload?
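
One way to run that check is a small timing comparison along these lines. This is only a sketch: the helper names `time_forward` and `compare_tf32`, the warmup and iteration counts, and the small stand-in convolution model are my own choices, not anything from your script. It times plain float32 inference without autocast, because the TF32 flags only affect float32 operations. If no GPU is available it falls back to the CPU, where the flag has no effect.

```python
import time

import torch
import torch.nn as nn


def time_forward(model, inputs, allow_tf32, warmup=3, iters=10):
    """Return mean seconds per forward pass with cudnn.allow_tf32 set as given."""
    previous = torch.backends.cudnn.allow_tf32
    torch.backends.cudnn.allow_tf32 = allow_tf32
    try:
        with torch.no_grad():
            # Warm-up runs so cuDNN algorithm selection is not timed.
            for _ in range(warmup):
                model(inputs)
            if inputs.is_cuda:
                torch.cuda.synchronize()
            start = time.perf_counter()
            for _ in range(iters):
                model(inputs)
            # CUDA kernels run asynchronously; wait before stopping the clock.
            if inputs.is_cuda:
                torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters
    finally:
        # Restore the original setting so later code is unaffected.
        torch.backends.cudnn.allow_tf32 = previous


def compare_tf32(model, inputs):
    """Time the same model with the flag off and on; keys are the flag values."""
    return {flag: time_forward(model, inputs, flag) for flag in (False, True)}


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Stand-in model (an assumption for this sketch); replace it with
    # models.resnet50().cuda().eval() to test the original workload.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(16, 16, kernel_size=3, padding=1),
    ).to(device).eval()
    inputs = torch.randn(5, 3, 224, 224, device=device)
    for flag, seconds in compare_tf32(model, inputs).items():
        print(f"cudnn.allow_tf32={flag}: {seconds * 1e3:.2f} ms per forward pass")
```

If the two timings differ clearly on your GPU, the float32 convolutions are most likely taking the TF32 path when the flag is on, whatever the TensorBoard view shows.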