PyTorch 2.5.1 has poor model inference performance on ARM machines

When I use DJL 0.21.0 + torch 1.13.1 on ARM machines, inference speed is normal. However, with DJL 0.34.0 + torch 2.5.1, inference time roughly doubles (i.e. it becomes about twice as slow). Downgrading torch to 2.1.2, 2.3.0, etc. (with the matching DJL versions) does not help. I suspect the cause is insufficient ARM optimization in the torch 2.x builds. What specifically causes this issue, and how can it be resolved?
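
To make the comparison concrete, below is a minimal benchmark sketch of the kind of measurement I mean. The model path `model.pt`, the 1x3x224x224 input shape, the iteration counts, and the thread settings are placeholders rather than my exact setup; the `ai.djl.pytorch.num_threads` / `ai.djl.pytorch.num_interop_threads` properties are used only to pin threading so the two torch versions are compared under the same configuration.

```java
import ai.djl.inference.Predictor;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

import java.nio.file.Paths;

public class ArmInferenceBenchmark {

    public static void main(String[] args) throws Exception {
        // Pin intra-op/inter-op threads so both torch versions run with the
        // same threading configuration (DJL PyTorch engine properties).
        System.setProperty("ai.djl.pytorch.num_threads", "4");
        System.setProperty("ai.djl.pytorch.num_interop_threads", "1");

        // "model.pt" and the 1x3x224x224 input are placeholders for the
        // actual TorchScript model and input used in this report.
        Criteria<NDList, NDList> criteria =
                Criteria.builder()
                        .setTypes(NDList.class, NDList.class)
                        .optModelPath(Paths.get("model.pt"))
                        .optEngine("PyTorch")
                        .build();

        try (ZooModel<NDList, NDList> model = criteria.loadModel();
                Predictor<NDList, NDList> predictor = model.newPredictor();
                NDManager manager = NDManager.newBaseManager()) {

            NDList input = new NDList(manager.ones(new Shape(1, 3, 224, 224)));

            // Warm up, then time a fixed number of forward passes.
            for (int i = 0; i < 10; i++) {
                predictor.predict(input);
            }
            int iterations = 100;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                predictor.predict(input);
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("avg latency: " + (elapsedMs / (double) iterations) + " ms");
        }
    }
}
```

If it helps, I can also rerun this with `DNNL_VERBOSE=1` set to compare which oneDNN / Arm Compute Library primitives get selected under torch 1.13.1 versus 2.5.1; my guess that the regression comes from that layer is an assumption, not something I have confirmed.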