From the official PyTorch documentation (http://pytorch.org/docs/notes/cuda.html#use-pinned-memory-buffers), it seems that pinning a batch in CPU memory can make the data transfer to the GPU much faster.
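For context, this is the pattern I understand the docs to be describing (a minimal sketch; the random tensors are just a stand-in for a real dataset):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset as a placeholder for real data.
dataset = TensorDataset(torch.randn(1000, 3, 32, 32),
                        torch.randint(0, 10, (1000,)))

# pin_memory=True makes the DataLoader copy each batch into page-locked
# (pinned) host memory before handing it to the training loop.
loader = DataLoader(dataset, batch_size=64, pin_memory=True, num_workers=2)

device = torch.device("cuda")
for images, labels in loader:
    # non_blocking=True allows the host-to-device copy to be asynchronous,
    # which is only effective when the source tensor lives in pinned memory.
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward/backward pass here ...
```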
Then comes the question: why is pin_memory False by default in DataLoader? Trying to recall the little I learned in my operating systems classes: does pinning mean that once a batch is pinned, it stays in memory until the process ends?