Does torch.utils.data.DataLoader perform disk I/O?

The above tutorial contains the code below.
Does this code load only one mini-batch from disk when the next batch is requested from dataiter?
Or does it load the entire dataset from disk into memory?

import torch
import torchvision
import torchvision.transforms as transforms

# Convert PIL images to tensors, then normalize each RGB channel with mean 0.5 and std 0.5.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                          shuffle=False, num_workers=2)

# Fetch one mini-batch; the built-in next() replaces the older dataiter.next() call.
dataiter = iter(trainloader)
images, labels = next(dataiter)

Once you create an iterator, it starts loading batches and accumulating them in a queue. The queue is bounded, so it will never load the whole dataset. Once the queue fills up, the workers wait until you take some data out. Since you’re using num_workers=2, data loading happens asynchronously with your script’s execution: 2 background processes fill up the queue.
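To make the bounded-queue behaviour concrete, here is a minimal sketch of the same producer/consumer pattern. It is not DataLoader's actual implementation: it uses threads rather than processes, and start_prefetch, max_queued, and the fake "batches" of integers are hypothetical names and stand-ins chosen for illustration.

```python
import queue
import threading
import time


def start_prefetch(dataset_len, batch_size, num_workers=2, max_queued=4):
    """Start background workers that fill a bounded queue with batches.

    Hypothetical helper for illustration only; a batch here is just a list
    of sample indices standing in for data read from disk.
    """
    batches = queue.Queue(maxsize=max_queued)  # bounded: put() blocks when full
    loaded = []                                # start index of every batch loaded so far
    starts = iter(range(0, dataset_len, batch_size))
    lock = threading.Lock()

    def worker():
        while True:
            with lock:                         # hand out each batch to exactly one worker
                start = next(starts, None)
            if start is None:                  # dataset exhausted
                return
            batch = list(range(start, min(start + batch_size, dataset_len)))
            loaded.append(start)
            batches.put(batch)                 # waits here while the queue is full

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    return batches, loaded


# 1000 samples in batches of 4 gives 250 batches in total.
batches, loaded = start_prefetch(dataset_len=1000, batch_size=4)
time.sleep(0.2)                # give the workers time to fill the queue
loaded_before = len(loaded)    # at most max_queued + num_workers batches
first = batches.get()          # taking a batch out frees one slot for a waiting worker
print(len(first), loaded_before, len(loaded))
```

In this sketch the workers stop after loading at most max_queued + num_workers batches (the queue's capacity plus one batch held by each blocked worker), far short of the 250 batches in the dataset, and they resume only when the consumer takes a batch out.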


Can you please explain what dataiter is and what information it holds?
As I understand it, iter(trainloader) iterates over the data, but how does it do that? Can we control it?
And what exactly does calling next on dataiter do?