In PyTorch, there is no tape in the traditional sense. The engine queues each backward job as soon as all of its dependencies are satisfied. So it does not reverse a recorded sequence of operations, but it still executes the jobs in a topologically sorted order. This way the tasks can easily be run on multiple threads (as long as they don't conflict with one another).
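To make the "queue a job once its dependencies are satisfied" idea concrete, here is a minimal pure-Python sketch of dependency-counted scheduling over a small backward graph. It is only an illustration of the idea, not PyTorch's actual engine code: the function `backward_order`, the `consumers_of` mapping, and the node names are all hypothetical, and it runs on a single thread.

```python
from collections import deque


def backward_order(consumers_of, root):
    """Return one valid execution order for the backward jobs.

    consumers_of maps a backward node to the nodes that receive the
    gradient it produces. All names here are hypothetical.
    """
    # Pass 1: walk the graph from the root and count, for every node,
    # how many producers must finish before it can run.
    pending = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for nxt in consumers_of.get(node, ()):
            if nxt not in pending:
                pending[nxt] = 0
                stack.append(nxt)
            pending[nxt] += 1

    # Pass 2: run whatever is ready; finishing a job may make others ready.
    ready = deque([root])
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)  # stand-in for "compute this node's gradient"
        for nxt in consumers_of.get(node, ()):
            pending[nxt] -= 1
            if pending[nxt] == 0:  # all dependencies satisfied
                ready.append(nxt)
    return order


# Assumed toy graph: the loss feeds a multiply, whose two inputs both
# depend on the same leaf tensor x.
graph = {
    "sum_backward": ["mul_backward"],
    "mul_backward": ["add_backward", "scale_backward"],
    "add_backward": ["accumulate_x"],
    "scale_backward": ["accumulate_x"],
}
order = backward_order(graph, "sum_backward")
print(order)
```

In this toy graph, `add_backward` and `scale_backward` sit in the ready queue at the same time, which is the kind of situation where independent jobs could be handed to different threads. `accumulate_x` only becomes ready after both of them have finished.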