Is there a backend difference when running a scripted model in Python vs C++?

Greetings. A couple of questions regarding x86 inference backend(s):

  1. Assume that I produce a model via torch.jit.script(). Is it correct to think that essentially the same JIT runtime and BLAS implementation are used regardless of whether this model is subsequently evaluated via the Python or the C++ interface? (Assuming the C++ app is compiled and linked against the lib{torch, torch_cpu, c10}.so shipped with the PyTorch distribution.)

Put differently: other than being able to run in a Python-less environment, are there benefits to doing model inference via the C++ API?
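For concreteness, this is the workflow I have in mind (a minimal sketch; the module and the file name are just placeholders):

```python
import torch

class Net(torch.nn.Module):
    def forward(self, x):
        # Trivial body just so there is something to script.
        return torch.relu(x)

# Script the model and serialize it for later use.
scripted = torch.jit.script(Net())
scripted.save("model.pt")

# Python-side inference on the scripted model.
out = scripted(torch.randn(3))
```

On the C++ side the same file would be loaded with `torch::jit::load("model.pt")` and run via the module's `forward` method; my question is whether those two paths end up in the same backend.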

  2. I have read everything I could find about FBGEMM but remain confused about when it actually kicks in. Is it used only if the model contains fused or quantized ops?
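To make the second question concrete, this is the kind of model I would expect to route to FBGEMM on x86 (a sketch using dynamic quantization; the layer sizes are arbitrary):

```python
import torch
import torch.ao.quantization

# A plain float model...
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())

# ...dynamically quantized, so the Linear layer becomes a quantized op.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# The active quantized engine; on x86 this is typically 'fbgemm'
# (or 'x86' on newer builds).
print(torch.backends.quantized.engine)
```

Is it only models like `qmodel` above that touch FBGEMM, or does the plain float `model` also use it in some cases?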

Thank you.