Why is the C++ extension faster? What is the overhead?

Hi,
I’m reading the cpp extension tutorial:

https://github.com/goldsborough/tutorials/blob/6e4a693f4ad056ac837b44db839822e29a8be3a1/advanced_source/cpp_extension.rst

In the LLTM example, using the C++ extension appears to make the forward pass faster. I didn't expect such a large speed gap.

What is the reason?