I am reading the code of PyTorch's DataLoader. PyTorch uses `threading.Thread` to manage `_pin_memory` (pytorch/dataloader.py at master · pytorch/pytorch · GitHub), but it uses `multiprocessing.Process` to manage the DataLoader workers (pytorch/dataloader.py at master · pytorch/pytorch · GitHub).
Are there any considerations behind this design?
Does this section answer your question?
# A `threading.Event` for a similar purpose to that of
# `workers_done_event`, but is for the `pin_memory_thread`. The reason
# that separate events are needed is that `pin_memory_thread` reads from
# the output queue of the workers. But the workers, upon seeing that
# `workers_done_event` is set, only want to see the final `None`, and are
# not required to flush all data in the output queue (e.g., a worker may
# call `cancel_join_thread` on that queue if its `IterableDataset` iterator
# happens to exhaust coincidentally, which is out of the control of the
# main process). Thus, since we will exit `pin_memory_thread` before the
# workers (see below), two separate events are used.
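The shutdown ordering that comment describes can be sketched in plain Python. This is a toy analogue only: it simulates the workers' output queue with a `queue.Queue` instead of a multiprocessing queue, and the names `pin_memory_done` / `workers_done` are hypothetical stand-ins for `pin_memory_thread_done_event` and `workers_done_event` in the real code. The point it illustrates is that the consumer thread gets its own event and is stopped *before* the producers are signaled.

```python
import queue
import threading

def pin_memory_loop(in_q, out_q, done_event):
    """Toy analogue of DataLoader's _pin_memory_loop: move items from the
    workers' output queue to a 'pinned' queue until asked to stop."""
    while not done_event.is_set():
        try:
            item = in_q.get(timeout=0.05)
        except queue.Empty:
            continue
        out_q.put(("pinned", item))  # the real thread would call .pin_memory() here
        in_q.task_done()

# Simulated worker output queue (a multiprocessing queue in the real code).
worker_out = queue.Queue()
pinned_out = queue.Queue()
pin_memory_done = threading.Event()   # stand-in for pin_memory_thread_done_event
workers_done = threading.Event()      # stand-in for workers_done_event

t = threading.Thread(target=pin_memory_loop,
                     args=(worker_out, pinned_out, pin_memory_done),
                     daemon=True)
t.start()

for i in range(3):
    worker_out.put(i)
worker_out.join()        # wait until the thread has drained everything

# Shutdown order mirrors the comment: stop the pin_memory thread *first*...
pin_memory_done.set()
t.join()
# ...and only then signal the workers. With a single shared event, a worker
# that exits early (leaving stale items in worker_out unflushed) could leave
# the pin_memory thread reading a queue nobody will finish filling.
workers_done.set()

results = [pinned_out.get_nowait() for _ in range(pinned_out.qsize())]
print(results)  # [('pinned', 0), ('pinned', 1), ('pinned', 2)]
```

In the real DataLoader the roles differ in kind, not just in order: the workers are separate processes (so `workers_done_event` is a `multiprocessing.Event`), while the pin_memory consumer is a thread in the main process (so its event is a `threading.Event`), which is another reason one shared event would not fit both.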