How does Fork/Join work in pipeline.py?

I am trying to understand some implementation details of pipeline parallelism.

In pipeline.py, a phony tensor is used to create an explicit dependency, which ensures that batches[i-1] is executed after batches[i] in backpropagation.

Question 1) fork returns two detached tensors, so how can the dependency be created in the join operation?
Question 2) Both batches[i] and batches[i-1] are replaced by detached tensors, so how can the system still track the graph for later operations, for example the following copy operation (_copy(batches[i], prev_stream, next_stream))?

def _depend(fork_from: Batch, join_to: Batch) -> None:
    fork_from_idx = fork_from.find_tensor_idx()
    join_to_idx = join_to.find_tensor_idx()

    fork_from[fork_from_idx], phony = fork(fork_from[fork_from_idx])
    join_to[join_to_idx] = join(join_to[join_to_idx], phony)
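To make both questions concrete, here is a minimal sketch I put together to experiment with the idea. It assumes fork and join are thin wrappers around custom torch.autograd.Function subclasses that call detach() inside forward; the Fork and Join classes below are my own simplified stand-ins, not the library's code, and the names a, b and order are purely illustrative. What it seems to show is that tensors returned from a custom Function's forward are re-attached to the graph by autograd, so the detach() inside forward does not disconnect them, and that the backward node of Fork cannot run until the backward node of Join has run.

```python
import torch

order = []  # illustrative: records the order in which backward nodes run


class Fork(torch.autograd.Function):
    """Simplified stand-in: returns the input plus an empty phony tensor."""

    @staticmethod
    def forward(ctx, input):
        phony = torch.empty(0, device=input.device)
        # Both outputs are detached here, but autograd attaches this
        # Function's backward node to them once forward returns.
        return input.detach(), phony.detach()

    @staticmethod
    def backward(ctx, grad_input, grad_phony):
        order.append("fork")
        return grad_input


class Join(torch.autograd.Function):
    """Simplified stand-in: passes input through, consuming the phony."""

    @staticmethod
    def forward(ctx, input, phony):
        return input.detach()

    @staticmethod
    def backward(ctx, grad_input):
        order.append("join")
        # No gradient for the phony; the graph edge alone gives the ordering.
        return grad_input, None


a = torch.ones(2, requires_grad=True)
b = torch.ones(2, requires_grad=True)
x = a * 2  # plays the role of batches[i-1]
y = b * 3  # plays the role of batches[i]

x, phony = Fork.apply(x)
y = Join.apply(y, phony)

# Despite detach() inside forward, the outputs are still part of the graph.
print(x.grad_fn is not None, phony.requires_grad, y.grad_fn is not None)

(x.sum() + y.sum()).backward()
print(order)  # ['join', 'fork']
```

In this sketch the backward node of Join has an edge to the backward node of Fork through the phony, so the engine has to finish the join side (batches[i]) before it can run the fork side (batches[i-1]). Gradients still reach a and b, which suggests that later operations on the replaced tensors are tracked as usual.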

I understand this is deprecated, but I would like to understand the logic. Any help is appreciated!