CUDA out of memory sometimes

I encountered a strange error. When I run a program, I get a CUDA out of memory error. When I run the same program on the same machine a few days later, the error disappears. I only changed num_workers from 10 to 25, and I don't think that would help with this problem. Can anyone explain why this kind of thing happens? Thanks in advance.

Each worker is a process that copies its data from the main process, afaik. So it is normal for the GPU memory usage to increase with the number of workers.

As long as you are loading the data into the system RAM (not the GPU memory), the number of workers shouldn't cause a CUDA OOM issue.
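To check this on your side, here is a minimal sketch (not your actual code) of a loader whose workers only ever produce CPU tensors, with the transfer to the GPU done in the main process. `RandomCpuDataset` and `run_epoch` are hypothetical names made up for illustration, and the sizes are arbitrary. It falls back to the CPU if no GPU is available.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class RandomCpuDataset(Dataset):
    """Hypothetical dataset: every sample is created on the CPU."""

    def __init__(self, n_samples=64, n_features=8):
        self.n_samples = n_samples
        self.n_features = n_features

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        # Runs inside a worker process when num_workers > 0.
        # No CUDA call happens here, so workers do not touch GPU memory.
        return torch.full((self.n_features,), float(idx))


def run_epoch(num_workers=0, batch_size=16):
    """Iterate once over the data and return the device type of each batch
    as it was delivered by the DataLoader (before any transfer)."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    loader = DataLoader(
        RandomCpuDataset(), batch_size=batch_size, num_workers=num_workers
    )
    delivered_on = []
    for batch in loader:
        delivered_on.append(batch.device.type)
        # The only host-to-device copy happens here, in the main process.
        batch = batch.to(device)
        if device.type == "cuda":
            # Bytes currently held by tensors on this device in this process.
            print(torch.cuda.memory_allocated(device))
    return delivered_on
```

If you call `run_epoch(num_workers=10)` and `run_epoch(num_workers=25)` (inside an `if __name__ == "__main__":` guard), the printed values should be a useful comparison between the two settings. If your own `Dataset` calls `.cuda()` or `.to("cuda")` inside `__getitem__`, that assumption no longer holds and the number of workers could matter.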

Is this error reproducible with num_workers=25, or are you seeing this issue sometimes even with 25 workers?