When I use DJL 0.21.0 + torch 1.13.1 on ARM machines, inference speed is normal. With DJL 0.34.0 + torch 2.5.1, however, inference takes roughly twice as long. Downgrading torch to 2.1.2, 2.3.0, etc. (downgrading the DJL version to match) does not help. This is presumably caused by insufficient ARM optimization in the torch 2.x builds. What specifically causes this, and how can it be resolved?
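For what it's worth, aarch64 builds of PyTorch typically dispatch CPU ops through oneDNN, optionally backed by the Arm Compute Library, so one way to narrow this down is to compare which kernels each version actually selects. A rough diagnostic sketch (the jar name and log path are placeholders; `DNNL_VERBOSE` and `OMP_NUM_THREADS` are standard oneDNN/OpenMP variables, assuming your libtorch build honors them):

```shell
# Pin the OpenMP thread count explicitly; a bad default on many-core ARM
# machines can by itself inflate latency noticeably.
export OMP_NUM_THREADS=$(nproc)

# Ask oneDNN to log every primitive it executes, with the chosen
# implementation and its runtime, so the 1.13.1 and 2.5.1 traces can be
# diffed kernel by kernel.
export DNNL_VERBOSE=1

# Placeholder for your actual DJL inference entry point.
java -jar my-inference-app.jar 2> dnnl_trace.log

# Implementations containing "ref" are unoptimized reference fallbacks;
# "acl"-tagged ones use the Arm Compute Library fast path. Many "ref" hits
# under torch 2.x would support the "insufficient ARM optimization" theory.
grep -c ref dnnl_trace.log
```

If the traces show the same kernels but different timings, the regression is more likely in threading or memory layout than in kernel selection.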
I’m facing the same issue. I tried version 2.7.1, but it still shows higher latency compared to 1.13.1. Is there any solution for this?