PyTorch 2.2 slower than PyTorch 1.11

Hi, I tried to run my model with PyTorch 2.2 and found it significantly slower than with PyTorch 1.11, with no code changes and no compilation.
I tried adding the following settings:
torch.set_float32_matmul_precision('high')
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.jit.enable_onednn_fusion(True)
but none of them helped.

One thing I noticed is that CPU usage is much lower with torch 2.2 (500%) than with torch 1.11 (2300%). Any hint as to why this would happen? Thanks a lot.

GPU utilization is 30% with torch 2.2 and 60% with torch 1.11.
There is no "dataloader" in my code; it is a real-time application.
I also tried setting the thread counts explicitly:
torch.set_num_interop_threads(24)
torch.set_num_threads(24)

Can you use a profiler tool to analyze the differences?
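
One possible way to do this is with `torch.profiler`, run once under each version and compared side by side. This is a minimal sketch: the small `Sequential` model and random input are placeholders for the real model, and `profile_cpu` is a hypothetical helper name.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder model and input; substitute the real ones.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 64),
).eval()
example_input = torch.randn(32, 256)


def profile_cpu(model, example_input, steps=20):
    """Profile CPU-side operator time for a few forward passes."""
    with torch.no_grad():
        # Warm-up runs so one-time initialization is not profiled.
        for _ in range(3):
            model(example_input)
        with profile(activities=[ProfilerActivity.CPU]) as prof:
            for _ in range(steps):
                model(example_input)
    # Aggregate per operator and sort by time spent in the op itself.
    return prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=15)


table = profile_cpu(model, example_input)
print(table)
```

For a model running on the GPU, `ProfilerActivity.CUDA` can be added to the `activities` list to capture kernel times as well.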

Hi, here is the CPU profiling information. It seems that all operations get much slower.

It also seems that the MKL-DNN package is different between the two installs; I am not sure whether this could be the cause.
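
To compare the math and threading libraries each install was built against, the build configuration can be printed under both versions. The sketch below filters `torch.__config__.show()` for a few keywords; the keyword list is an assumption about which lines are relevant, and `build_summary` is a hypothetical helper name.

```python
import torch


def build_summary():
    """Summarize the math/threading libraries this PyTorch build reports."""
    config = torch.__config__.show()
    # Assumed keywords of interest; extend the tuple as needed.
    keywords = ("MKL", "DNN", "OPENMP", "BLAS")
    config_lines = [
        line.strip()
        for line in config.splitlines()
        if any(keyword in line.upper() for keyword in keywords)
    ]
    return {
        "torch_version": torch.__version__,
        "mkldnn_available": torch.backends.mkldnn.is_available(),
        "mkl_available": torch.backends.mkl.is_available(),
        "config_lines": config_lines,
    }


summary = build_summary()
print("torch:", summary["torch_version"])
print("mkldnn available:", summary["mkldnn_available"])
print("mkl available:", summary["mkl_available"])
for line in summary["config_lines"]:
    print(line)
```

Diffing this output between the two environments makes any difference in library versions or build flags explicit.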

Please open an issue and include a way to reproduce the problem so that the code owners can analyze it. Thank you.
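
A reproduction usually needs a small timing script that gives comparable numbers under both versions. The following is one possible sketch using only the standard library; `time_callable` and `workload` are hypothetical names, and the workload here is a placeholder for the model's forward pass.

```python
import statistics
import time


def time_callable(fn, warmup=10, iters=100, sync=None):
    """Time repeated calls to fn and return latency statistics in ms.

    sync is an optional callable run after each call, for workloads
    that execute asynchronously and must be waited on before timing.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1e3)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "iters": iters,
    }


# Placeholder workload; replace with a call into the model.
def workload():
    return sum(i * i for i in range(10_000))


stats = time_callable(workload, warmup=2, iters=20)
print(stats)
```

With PyTorch, `fn` could be something like `lambda: model(x)`, and on a GPU `sync=torch.cuda.synchronize` should be passed, because CUDA kernels are launched asynchronously and the timer would otherwise stop before the work finishes.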