I spent the better part of the day trying to script a couple of facial detection and recognition models for inference, hoping to see some performance improvement, and came away very disappointed.
- Scripting is pretty restrictive. Some very benign code doesn’t compile, and I was forced to exclude it from JIT compilation. In my case it was code handling a list of tensors, which errored because the JIT doesn’t know how to handle such objects.
- One of the simpler models actually scripted very easily but… it was slower for the first couple of passes and then performed identically to plain Python (CPython).
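
For the list-of-tensors case, here is a minimal sketch of one possible cause and workaround. It assumes the failure came from a missing type annotation: TorchScript treats unannotated function arguments as tensors, so a list argument has to be declared as `List[torch.Tensor]`. `stack_crops` is a hypothetical helper for illustration, not code from the models above, and the actual error may well have had a different cause.

```python
from typing import List

import torch


@torch.jit.script
def stack_crops(crops: List[torch.Tensor]) -> torch.Tensor:
    # Hypothetical helper: the explicit List[torch.Tensor] annotation tells
    # TorchScript that the argument is a list rather than a single tensor.
    return torch.stack(crops, dim=0)


# Two equally sized dummy "crops" stacked into one batch tensor.
batch = stack_crops([torch.zeros(3, 4), torch.ones(3, 4)])
```

The crops must all have the same shape for `torch.stack` to succeed, so variable-sized detections would still need resizing or padding first.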
I was reading through this and was really hopeful, but at this point (PyTorch 1.7.1 / CUDA 11.2 / cuDNN 8.05), with my relatively common pretrained models, I am not seeing anything positive. Has anyone been able to verify performance improvements? I have experience with Lua/LuaJIT, where the JIT gives drastic speedups.
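
For anyone who wants to compare numbers, here is a minimal timing sketch that discards the first few calls, since those early passes of a scripted model tend to include one-time profiling and optimization work. `time_calls` and its parameters are hypothetical names for illustration, and the warm-up and iteration counts are arbitrary defaults rather than recommendations.

```python
import time
from typing import Callable, Optional


def time_calls(
    fn: Callable[[], object],
    warmup: int = 10,
    iters: int = 50,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock seconds per call, excluding warm-up calls."""
    # Warm-up calls are run but not timed.
    for _ in range(warmup):
        fn()
    # Optional barrier so pending asynchronous work is not counted.
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    # Wait for the timed work to finish before stopping the clock.
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters


# Tiny self-check with a dummy workload.
calls = []
mean_seconds = time_calls(lambda: calls.append(1), warmup=3, iters=5)
```

On a GPU, CUDA kernels launch asynchronously, so something like `time_calls(lambda: scripted(x), sync=torch.cuda.synchronize)` would be needed for the numbers to mean anything; `scripted` and `x` stand in for a scripted model and its input. Running the same call on the eager model gives the baseline to compare against.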