But I am seeing low CPU and GPU utilization. CPU utilization stays low, around 30%, and GPU utilization jumps to 70% for about one second and then sits at 0% most of the time.
Below is one of my dataloader implementations. My other implementation converts the MXNet record file to ordinary JPEG files beforehand and reads them with PIL/OpenCV, but it is also very slow.
I tried num_workers and pin_memory; neither of them significantly speeds up the data loading. But if I skip the loading part and generate random tensors instead, the GPU can reach 90% utilization.
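To put a number on the loading speed by itself, the loader can be iterated without any model attached. The following is only a minimal sketch: `measure_throughput` and its parameters are hypothetical names introduced here, and it assumes each batch is an `(images, labels)` pair where `len(images)` gives the batch size.

```python
import time


def measure_throughput(batches, max_batches=100, warmup=5):
    """Iterate over `batches` and return (samples_per_second, samples_seen).

    `batches` is any iterable yielding (images, labels) pairs; the first
    `warmup` batches are consumed but not timed, so worker start-up cost
    is not counted.
    """
    it = iter(batches)

    # Consume warm-up batches without timing them.
    for _ in range(warmup):
        try:
            next(it)
        except StopIteration:
            return 0.0, 0

    samples_seen = 0
    batches_seen = 0
    start = time.perf_counter()
    for batch in it:
        images = batch[0]
        samples_seen += len(images)  # assumes len() gives the batch size
        batches_seen += 1
        if batches_seen >= max_batches:
            break
    elapsed = time.perf_counter() - start

    if batches_seen == 0 or elapsed <= 0:
        return 0.0, samples_seen
    return samples_seen / elapsed, samples_seen
```

Passing a `torch.utils.data.DataLoader` built on the dataset as `batches`, once per `num_workers` setting, would show whether the samples-per-second figure scales with the number of workers. Comparing that figure with the rate the training loop reaches on random tensors indicates how far the loader is from keeping the GPU busy.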
So any suggestions?
import torch
import torchvision.transforms as transforms
import mxnet as mx
from mxnet import recordio


class MyDataset(torch.utils.data.Dataset):
    def __init__(self, mxnet_record='train.rec', mxnet_idx='train.idx'):
        self.data = recordio.MXIndexedRecordIO(mxnet_idx, mxnet_record, 'r')
        self.transform = transforms.Compose([
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return 3804846

    def __getitem__(self, index):
        header, s = recordio.unpack(self.data.read_idx(index + 1))
        image = mx.image.imdecode(s).asnumpy()
        label = int(header.label)
        image = self.transform(image)
        return image, torch.tensor(label, dtype=torch.long)
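
One way to see which step of `__getitem__` dominates (reading the record, decoding the JPEG, or the transform) is to time each call separately. The helper below is a rough sketch only: `StageTimer` is a hypothetical name introduced here and is not part of PyTorch or MXNet.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> total seconds
        self.counts = defaultdict(int)    # stage name -> number of calls

    def time(self, name, fn, *args, **kwargs):
        """Call fn(*args, **kwargs), record its duration under `name`,
        and return its result unchanged."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def mean_ms(self):
        """Return the mean duration per call, in milliseconds, per stage."""
        return {
            name: 1000.0 * self.totals[name] / self.counts[name]
            for name in self.totals
        }
```

Inside `__getitem__`, the three calls could be wrapped as `timer.time('read', self.data.read_idx, index + 1)`, `timer.time('decode', mx.image.imdecode, s)` and `timer.time('transform', self.transform, image)`. This should be run with `num_workers=0`, because each worker process would otherwise hold its own copy of the timer and the main process would see no measurements.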