How to speed up using DataLoader

I have a dataset of about a million rows. Previously, I read the rows, preprocessed the data, and built a list of examples to train on. Then I defined a DataLoader over this data like:

train_dataloader = torch.utils.data.DataLoader(mydata['train'],
        batch_size=node_batch_size, shuffle=shuffle, collate_fn=data_collator)

Preprocessing can be time consuming, so I thought of defining an IterableDataset with an `__iter__` function instead, so that rows are preprocessed on the fly. Then I could define my DataLoader like:

train_dataloader = torch.utils.data.DataLoader(myds['train'],
        batch_size=node_batch_size, shuffle=shuffle, collate_fn=data_collator)
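To make the idea concrete, here is a minimal sketch of the kind of IterableDataset I mean; `preprocess_row`, `LazyDataset`, and the toy data are placeholders, not my real code:

```python
# Sketch: an IterableDataset that preprocesses rows lazily, one at a
# time, instead of building the full preprocessed list up front.
import torch
from torch.utils.data import IterableDataset, DataLoader

def preprocess_row(row):
    # Placeholder for the real (expensive) preprocessing step.
    return row * 2

class LazyDataset(IterableDataset):
    def __init__(self, raw_rows):
        self.raw_rows = raw_rows

    def __iter__(self):
        # Rows are preprocessed only as they are consumed by the loader.
        for row in self.raw_rows:
            yield preprocess_row(row)

ds = LazyDataset(range(8))
# Note: shuffle must stay False (the default) for an IterableDataset.
loader = DataLoader(ds, batch_size=4)
batches = [b.tolist() for b in loader]
print(batches)  # [[0, 2, 4, 6], [8, 10, 12, 14]]
```

One detail I noticed while writing this: `DataLoader` rejects `shuffle=True` for an IterableDataset, since there is no index to shuffle over.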

However, before training even begins it still seems to call my preprocessing function and build an iterator over the whole dataset up front. So I wonder what the benefit of this DataLoader is; it seems I didn't gain much speed-up.

Could you please guide me on how to speed things up in this case?