Pin_memory and sampler

spnova12 · June 13, 2018, 9:38am

torch.utils.data.DataLoader(
    train_dataset,
    batch_size=args.batch_size,
    shuffle=(train_sampler is None),
    num_workers=args.workers,
    pin_memory=True,
    sampler=train_sampler)

What are ‘pin_memory’ and ‘sampler’ here?
I could not understand this explanation.

“sampler (Sampler, optional): defines the strategy to draw samples from the dataset. If specified, shuffle must be False.”

Also, is it true that ‘pin_memory=True’ speeds up GPU operations? And will it make me run out of memory sooner?

ptrblck · June 13, 2018, 10:50am

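The sampler decides which dataset indices are drawn and in what order; the DataLoader then loads the samples at those indices. shuffle=True simply creates a random sampler for you, which is why shuffle must be False when you pass your own sampler.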
Have a look at the docs for the provided samplers.
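
Here is a small sketch of how the sampler argument changes what the DataLoader yields. The toy dataset, the batch size, and the index subset are made up for illustration and are not taken from your script:

```python
import torch
from torch.utils.data import (DataLoader, TensorDataset,
                              SequentialSampler, SubsetRandomSampler)

# Toy dataset (assumption for illustration): ten samples holding 0..9.
dataset = TensorDataset(torch.arange(10))

# SequentialSampler yields the indices 0, 1, 2, ... in order.
seq_loader = DataLoader(dataset, batch_size=5,
                        sampler=SequentialSampler(dataset))
seq_batches = [batch[0].tolist() for batch in seq_loader]

# SubsetRandomSampler draws only from the given indices, in random order.
subset_indices = [0, 2, 4, 6, 8]
subset_loader = DataLoader(dataset, batch_size=5,
                           sampler=SubsetRandomSampler(subset_indices))
subset_batches = [batch[0].tolist() for batch in subset_loader]

# Passing shuffle=True together with a sampler is rejected.
try:
    DataLoader(dataset, shuffle=True, sampler=SequentialSampler(dataset))
    conflict_error = None
except ValueError as err:
    conflict_error = err

print(seq_batches)
print(subset_batches)
print(conflict_error)
```

This is also why your snippet uses shuffle=(train_sampler is None): shuffling is only requested when no custom sampler was passed.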

There is also an explanation regarding pin_memory.

Host to GPU copies are much faster when they originate from pinned (page-locked) memory.

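This will not speed up the computation on the GPU itself, only the copies between the host and the GPU.
Also, it shouldn’t make you run out of GPU memory: pinned memory is page-locked host RAM, not GPU memory. Pinning very large amounts can leave less RAM for the rest of the system, though.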
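
A minimal sketch of the usual pattern, assuming a made-up toy dataset and falling back to the CPU when no GPU is available. With pin_memory=True the DataLoader returns batches in pinned host memory, and non_blocking=True lets the copy to the GPU run asynchronously:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data (assumption for illustration): 64 samples, 3 features each.
dataset = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Only ask for pinned memory when there is a GPU to copy to.
loader = DataLoader(dataset, batch_size=16, pin_memory=use_cuda)

for inputs, targets in loader:
    # With pin_memory=True the batch sits in page-locked host RAM.
    pinned = use_cuda and inputs.is_pinned()
    # non_blocking=True only has an effect when the source is pinned.
    inputs = inputs.to(device, non_blocking=True)
    targets = targets.to(device, non_blocking=True)
    break

print(pinned, inputs.device)
```

The model's forward and backward passes run at the same speed either way; only the transfer step benefits.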