I am trying to train MobileFaceNet on MS1M-IBUG (85K ids / 3.8M images),
but I am seeing low CPU and GPU utilization. CPU utilization stays low, around 30%, while the GPU jumps to 70% for about a second and then sits at 0% most of the time.
Below is one of my dataloader implementations. My other implementation converts the MXNet record to ordinary JPEG files beforehand and reads them with PIL/OpenCV, but it’s also very slow.
I tried changing num_workers and pin_memory, but neither significantly speeds up data loading. However, if I skip the loading part and generate random tensors instead, the GPU reaches 90% utilization.
So any suggestions?
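To put a number on the comparison with random tensors, the loader can be timed on its own, with no model in the loop. This is only a minimal sketch using the standard library; `measure_throughput` is a hypothetical helper name, and it accepts any iterable of batches, so I am assuming a `torch.utils.data.DataLoader` can be passed in the same way as the plain list in the usage lines.

```python
import itertools
import time


def measure_throughput(batches, max_batches=100):
    """Iterate over up to `max_batches` items and time the loop.

    `batches` can be any iterable of batches (assumption: e.g. a DataLoader).
    Returns a tuple (count, elapsed_seconds, batches_per_second).
    """
    count = 0
    start = time.perf_counter()
    for _ in itertools.islice(batches, max_batches):
        count += 1
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


if __name__ == "__main__":
    # Stand-in for a real loader: 250 fake "batches" of 8 items each.
    fake_loader = [[0] * 8 for _ in range(250)]
    n, secs, rate = measure_throughput(fake_loader, max_batches=100)
    print(f"{n} batches in {secs:.4f}s ({rate:.1f} batches/s)")
```

Running this once on the real loader and once on a loader that yields random tensors should show how far apart the two are. With worker processes, the first few batches also include worker startup time, so a larger `max_batches` probably gives a fairer number.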
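import torch
import torchvision.transforms as transforms
import mxnet as mx
from mxnet import recordio


class MyDataset(torch.utils.data.Dataset):
    def __init__(self, mxnet_record='train.rec', mxnet_idx='train.idx'):
        # Indexed reader over the MXNet RecordIO file (random access by key).
        self.data = recordio.MXIndexedRecordIO(mxnet_idx, mxnet_record, 'r')
        self.transform = transforms.Compose([
            # imdecode(...).asnumpy() yields an HWC uint8 array; convert it to a
            # PIL image so RandomHorizontalFlip can operate on it.
            transforms.ToPILImage(),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
        ])

    def __len__(self):
        # Hard-coded dataset size (about 3.8M images).
        return 3804846

    def __getitem__(self, index):
        # Record keys are read with a +1 offset relative to the dataset index.
        header, s = recordio.unpack(self.data.read_idx(index + 1))
        image = mx.image.imdecode(s).asnumpy()
        label = int(header.label)
        image = self.transform(image)
        return image, torch.tensor(label, dtype=torch.long)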
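To find out which step of `__getitem__` dominates (reading the record, decoding the JPEG, or the transform), each step could be wrapped in a small timer. The following is a rough sketch, not something I have run against the real dataset; `StageTimer` and the stage names are hypothetical, and the placeholder work in the usage lines only stands in for the real `read_idx`, `imdecode`, and transform calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> total seconds
        self.counts = defaultdict(int)    # stage name -> number of calls

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return {stage name: mean seconds per call}."""
        return {name: self.totals[name] / self.counts[name] for name in self.totals}


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(3):
        with timer.stage("read"):
            data = bytes(1024)      # placeholder for self.data.read_idx(...)
        with timer.stage("decode"):
            decoded = list(data)    # placeholder for mx.image.imdecode(...)
    for name, mean in timer.report().items():
        print(f"{name}: {mean * 1e6:.1f} us per call")
```

Since DataLoader workers are separate processes, a timer attached to the dataset would only accumulate in the main process when num_workers is 0, so that is the setting I would use for this measurement.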