Inference latency with many threads vs. one thread on CPU

Hello, I am using libtorch 1.10.0 to run model inference on CPU (CentOS, x86). With a single thread the latency is ~30 ms, but when I run 64 threads (the system has 64 cores) the per-call latency rises to ~70 ms. Note: each thread runs its own independent inference with independent resources, and each uses only one intra-op thread, i.e. at::set_num_threads(1). Is this normal? Intuitively it seems normal because the CPU becomes busier, but I wonder if there is a more detailed explanation. Thanks!