Model gets slower on CPU when I increase the number of threads

During inference, I noticed that PyTorch uses only 50% of my CPU threads. I then tried increasing the number of threads to 10, 12, and 16 (my CPU has 16 threads) by calling torch.set_num_threads. Setting the number of threads to 16 keeps all threads busy, but the model becomes slower, even when no other processes are running.

Is it a bug?
Could you please explain why I cannot utilize all my threads?

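It is not necessarily a bug. Using too many threads can decrease overall performance. For example, if the workload assigned to each thread is too small, the overhead of creating and synchronizing the threads starts to outweigh the time saved by running the work in parallel.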
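One way to find a good setting is to time the same forward pass at several thread counts and compare. The snippet below is only a rough sketch: `benchmark_threads` is a hypothetical helper, and the small `Sequential` model and random input are placeholders for your own model and data. It relies only on `torch.set_num_threads` and `torch.get_num_threads`.

```python
import time

import torch


def benchmark_threads(model, example_input, thread_counts, warmup=3, repeats=10):
    """Return {num_threads: mean seconds per forward pass} (hypothetical helper)."""
    original = torch.get_num_threads()
    results = {}
    model.eval()
    try:
        for n in thread_counts:
            torch.set_num_threads(n)
            with torch.no_grad():
                # Warm-up runs so one-time setup costs are not measured.
                for _ in range(warmup):
                    model(example_input)
                start = time.perf_counter()
                for _ in range(repeats):
                    model(example_input)
                elapsed = time.perf_counter() - start
            results[n] = elapsed / repeats
    finally:
        # Restore the thread count that was active before the benchmark.
        torch.set_num_threads(original)
    return results


# Placeholder model and input; substitute your own.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
x = torch.randn(32, 256)

timings = benchmark_threads(model, x, thread_counts=[1, 2, 4, 8, 16])
for n, seconds in timings.items():
    print(f"{n:2d} threads: {seconds * 1e3:.3f} ms per forward pass")
```

The timings depend on the model, batch size, and machine, so the fastest thread count on one setup may not carry over to another. For small workloads like the placeholder above, the lower thread counts may well come out ahead.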