Hi,
I’ve got a problem with a memory leak during training. I suspect the main cause is the Dataset created with torchvision.datasets.ImageFolder (when I use torchvision.datasets.CIFAR10 instead of my own dataset, the problem does not occur). I’ve tried to find a solution in similar topics.
Here is my dataset
To be honest, I checked it in Task Manager. On Linux I get a runtime error during training (I suspected it was related to the small memory capacity of my GPU, but when I use the CPU, my RAM fills up during training).
In your screenshot, it seems like there is no leak, just high usage.
If you use many workers, or if your dataset has larger images or a larger batch size, then more memory is needed to load the data.
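To get a feel for how these factors combine, here is a rough back-of-the-envelope sketch in plain Python. It assumes decoded float32 image tensors and that each DataLoader worker keeps prefetch_factor batches in flight (2 is the documented default when num_workers > 0). The helper name and the 224x224 image size are hypothetical, chosen only for illustration; the actual image size in the dataset above is not known, and real usage will be higher because of per-worker process overhead.

```python
def estimate_prefetch_bytes(batch_size, image_shape, num_workers, prefetch_factor=2,
                            bytes_per_value=4):
    """Rough lower bound on memory held by batches waiting in the loader.

    Hypothetical helper, not part of PyTorch. Assumes float32 tensors
    (4 bytes per value) and ignores labels and process overhead.
    """
    values_per_image = 1
    for dim in image_shape:
        values_per_image *= dim
    batch_bytes = batch_size * values_per_image * bytes_per_value
    # With num_workers == 0 the main process loads one batch at a time.
    batches_in_flight = num_workers * prefetch_factor if num_workers > 0 else 1
    return batch_bytes * batches_in_flight


MIB = 1024 ** 2

# CIFAR10-sized images (3x32x32) versus an assumed 3x224x224 ImageFolder image.
cifar = estimate_prefetch_bytes(64, (3, 32, 32), num_workers=4)
larger = estimate_prefetch_bytes(64, (3, 224, 224), num_workers=4)

print(f"CIFAR10-sized: {cifar / MIB:.1f} MiB")   # 6.0 MiB
print(f"224x224:       {larger / MIB:.1f} MiB")  # 294.0 MiB
```

Under these assumptions the same loader settings need about 49 times more memory for 224x224 images than for CIFAR10-sized ones, which would look like high usage rather than a leak. A leak would instead show memory that keeps growing from epoch to epoch with the settings unchanged.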