How does PyTorch implement the backward pass?

Because apply exists in two forms: a static one, which is called when you invoke it on the class itself with Scatter.apply(), and a non-static one, which is called on an instance of Scatter with Scatter().apply().
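To make the class-versus-instance distinction concrete, here is a minimal pure-Python sketch. It is a toy, not PyTorch's actual code: `DualApply`, `ToyScatter`, `_class_apply` and `_instance_apply` are hypothetical names, and the descriptor is just one assumed way to get a single name to resolve to two different functions. It only illustrates that calling `apply` on the class can run the forward logic while calling `apply` on an instance can run the backward logic.

```python
class DualApply:
    """Descriptor that returns a different callable for class vs. instance access."""

    def __init__(self, on_class, on_instance):
        self.on_class = on_class
        self.on_instance = on_instance

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed as ToyScatter.apply: behaves like a static/class-level call.
            return lambda *args: self.on_class(owner, *args)
        # Accessed as ToyScatter().apply: behaves like a bound instance method.
        return lambda *args: self.on_instance(instance, *args)


class ToyScatter:
    """Hypothetical stand-in for an autograd function (not torch's Scatter)."""

    @staticmethod
    def forward(ctx, x):
        ctx.saved = x
        return [x, x]  # pretend to copy x to two devices

    @staticmethod
    def backward(ctx, grads):
        return sum(grads)  # gradients from both copies are added together

    def _class_apply(cls, x):
        # Class-level path: create a context object and run forward.
        ctx = cls()
        return ctx, cls.forward(ctx, x)

    def _instance_apply(self, grads):
        # Instance-level path: the instance is the context; run backward.
        return type(self).backward(self, grads)

    apply = DualApply(_class_apply, _instance_apply)


ctx, outputs = ToyScatter.apply(3.0)  # called on the class: forward
grad = ctx.apply([1.0, 1.0])          # called on an instance: backward
```

In this toy, the object returned by the class-level call doubles as the context that the instance-level call later uses, which is why the two calls can share one name without colliding.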