Loading dataset into GPU

I’m using a dataset of audio feature vectors (each sample is represented by 40 features). At the moment I create a matrix in the __init__ method of my custom dataset and push it to the GPU:

self.frame_array_device = torch.from_numpy(self.frame_array)
self.frame_array_device.to(device)

The dataset’s __getitem__ method is defined as follows:

def __getitem__(self, idx):
        return self.frame_array_device[idx, :]

Samples are drawn by the dataloader:

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=False)

In the training loop I double-check that the features are on the GPU:
batch_features = batch_features.to(device)

However, training is embarrassingly slow and CPU usage is at 100%…
I think there is something wrong here, could you give me a hint?
Thanks a lot

I am new to ML, so take what I say with a grain of salt.

Loading the data is, I think, the slowest part. I notice that between epochs my CPU is always at 100-200% even though my device prints as Device: Tesla P100-PCIE-16GB. Then, during a short portion of every epoch, my GPU also goes to 100%.

Subscribed to see what experts have to say.

This line of code:

self.frame_array_device.to(device)

won’t push the data to the device unless you reassign the result:

self.frame_array_device = self.frame_array_device.to(device)
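
As a quick sanity check (a minimal sketch; the random tensor here just stands in for frame_array_device), you can see that Tensor.to() returns a new tensor instead of moving the original in place:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(8, 40)   # stand-in for the frame array (created on the CPU)
x.to(device)             # returns a new tensor; the result is discarded here
print(x.device)          # still cpu

x = x.to(device)         # keep the returned tensor
print(x.device)          # cuda:0, if a GPU is available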

That being said, you could try to profile the data loading using the data_time object from the ImageNet example, or alternatively you could create a random CUDA tensor and execute the training without any data loading at all to check the GPU utilization.
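
A rough sketch of both checks might look like this (the model, batch size, and the small random dataset are placeholder assumptions, and the timing only mimics the data_time pattern from the ImageNet example rather than copying it verbatim):

import time
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(40, 10).to(device)   # placeholder model, not the one from the question
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# 1) Time how long each batch spends in data loading.
#    train_loader here stands in for the DataLoader from the question.
train_loader = torch.utils.data.DataLoader(torch.randn(1000, 40), batch_size=256)
end = time.time()
for batch_features in train_loader:
    data_time = time.time() - end              # time spent waiting for the next batch
    batch_features = batch_features.to(device)
    # ... forward / backward / optimizer step ...
    print("data loading took {:.4f}s".format(data_time))
    end = time.time()

# 2) Remove the data loading completely: train on a random CUDA tensor
#    and watch the GPU utilization (e.g. via nvidia-smi) while this runs.
fake_batch = torch.randn(256, 40, device=device)
fake_target = torch.randint(0, 10, (256,), device=device)
for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(fake_batch), fake_target)
    loss.backward()
    optimizer.step()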