GPU utilization is low

Hi,
I am training a simple DNN with 3 hidden layers of 512 nodes each. I am loading the data using a DataLoader with batch_size = 100 and num_workers = 128.
I have checked GPU usage with nvidia-smi, and it always shows 1% GPU utilization.

How can I increase the GPU utilization?

Thanks

Why did you use 128 workers? Why not num_workers=0?

If num_workers is higher, the DataLoader will load the data faster.

Did you check whether it works as you expect?

Yes, I checked by increasing num_workers.

How large is the dataset you used? How long is the input?

There are 16283 batches in total; each batch is of size 100 × 1257.

I am using a custom dataloader to load the data in batches.

The DataLoader creates a new PROCESS for every worker, and the dataset has to be “copied” to every worker. Depending on whether you’re on Windows or Linux (spawning a process on Windows is much more expensive than forking one on Linux), and how the dataset stores its data (Tensors don’t seem to get copied from what I’ve tested, but Python structures do), you might have a very high overhead for creating the processes.

Unless you’re working with some supercomputer, I believe 8 workers is more than enough.
Also, most importantly, check your RAM usage: if your OS starts swapping memory to disk, things can get extremely slow.

Reduce the number of workers, and check for improvements.
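For example, something like this (just a sketch; my_dataset stands in for whatever dataset object you are using):

from torch.utils.data import DataLoader

# Start with a modest worker count and only increase it if it actually helps.
loader = DataLoader(my_dataset, batch_size=100, shuffle=True,
                    num_workers=8, pin_memory=True)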

Your problem might be that there simply isn’t that much to do for the GPU (if your model is very small, or your batch size is very small, for example), not necessarily that it’s waiting for data.

An easy way to check is to look for “pits” in GPU usage: if there are times the GPU usage suddenly decreases, it’s probably waiting for data (although you probably won’t be able to see this now, as your “max” appears to be 1%)


Hi cosmin.pascaru,
I changed num_workers to 8, but it still shows a constant 1% GPU usage. The CPU memory usage is listed below. How do I stop swapping memory to disk?
Mem[|||||||||||||||8.41G/31.1G]
Swp[|||||||||||||||17.6G/31.7G]

If the data type is float32, the size of the entire dataset is about 8 GB, right? If so, you should put all the data in memory and set num_workers=0.
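Roughly like this (a sketch only; load_all_samples and load_all_labels are placeholders for whatever reads your files):

import torch
from torch.utils.data import TensorDataset, DataLoader

# Rough size check: 16283 batches * 100 samples * 1257 floats * 4 bytes ≈ 8.2 GB,
# which fits in your 31 GB of RAM.
data = load_all_samples()      # e.g. a float32 tensor of shape (1628300, 1257)
labels = load_all_labels()

dataset = TensorDataset(data, labels)
loader = DataLoader(dataset, batch_size=100, shuffle=True, num_workers=0)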

Hi Tony,
I have tried with num_workers=0 as well. It is still showing only 1% usage.

Did you put all data on memory?

In which memory? I have set num_workers=0 in the DataLoader. All the data is in local storage, and I am loading it using the DataLoader.

Sorry, I mean CPU memory (RAM).

If we use the DataLoader, it will load data into CPU memory.

You should use a custom dataset as follows:

from torch.utils.data import Dataset


class CustomDataset(Dataset):
    def __init__(self, path):
        super(CustomDataset, self).__init__()
        # loadFrom is whatever routine reads the whole dataset into memory
        self.data = loadFrom(path)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)

This way, the time spent loading data from local storage during training is eliminated.
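It would be used roughly like this (a sketch; the path is a placeholder):

from torch.utils.data import DataLoader

dataset = CustomDataset("/path/to/data")   # loads everything once, in __init__
loader = DataLoader(dataset, batch_size=100, shuffle=True, num_workers=0)

for batch in loader:
    ...  # each batch only indexes data already in memory, no disk reads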

Yes, I am doing the same thing, but instead of loading the data in __init__, I am loading it in __getitem__.

Loading should not happen in __getitem__. Mini-batch creation becomes very slow if the loading is done in __getitem__.
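In other words, this is the slow pattern (a sketch; load_sample_from_disk is just a placeholder for your per-sample loading code):

# Slow: every __getitem__ call hits the disk, so each mini-batch
# waits on file I/O instead of indexing tensors already in RAM.
def __getitem__(self, index):
    return load_sample_from_disk(self.paths[index])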

I am not creating any mini-batches.