How to load all data into GPU for training

I am currently using nn.DataParallel, but I will give DDP a try.
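
For reference, here is a minimal sketch of what "loading all data into the GPU" could look like in the single-process case. It assumes the whole dataset fits in device memory as two tensors. The helper name `gpu_resident_batches` and its parameters are made up for illustration and are not part of PyTorch. It falls back to CPU when no GPU is available.

```python
import torch


def gpu_resident_batches(features, labels, batch_size, device=None, shuffle=True):
    """Yield mini-batches from tensors that are kept entirely on `device`.

    Hypothetical helper: assumes `features` and `labels` share their first
    dimension and that both fit in device memory at once.
    """
    if device is None:
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # One-time copy of the full dataset to the device.
    features = features.to(device)
    labels = labels.to(device)

    n = features.shape[0]
    # Build the sample order on the same device so indexing needs no host copy.
    if shuffle:
        order = torch.randperm(n, device=device)
    else:
        order = torch.arange(n, device=device)

    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], labels[idx]
```

Call it once per epoch, e.g. `for xb, yb in gpu_resident_batches(x, y, batch_size=256): ...`, to get a fresh shuffle each time. With DDP, each process would run its own copy of this loop, so each one would need either the full dataset or its own shard on its GPU.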