I’m running inference with a network that I’ve already trained.
It runs on a large set of images. For each image, I want to send it through the network, do some post-processing on the output, and save the result to a file.
My problem is that the post-processing takes some time (~1-2 sec per image). While it runs, the GPU sits idle waiting for the next image, so overall throughput drops.
What is the correct way to delegate the post-processing? Maybe something similar to the way DataLoader delegates the pre-processing?
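
To make the question concrete, here is a minimal sketch of the kind of overlap I have in mind, using only the standard library's `concurrent.futures`. The names `run_pipeline`, `infer`, `postprocess`, and `max_pending` are hypothetical placeholders of mine, not PyTorch APIs: `infer` stands in for the forward pass (assumed to return data already moved to the CPU), and `postprocess` stands in for the slow post-processing and file save.

```python
from collections import deque
from concurrent.futures import Executor, ThreadPoolExecutor


def run_pipeline(inputs, infer, postprocess, executor: Executor, max_pending=8):
    """Run infer() in the calling thread and postprocess() on the executor.

    Hypothetical helper: infer is a stand-in for the network's forward pass,
    postprocess for the slow per-image post-processing and save step.
    Results are returned in the same order as the inputs.
    """
    pending = deque()  # futures for outputs still being post-processed
    results = []
    for item in inputs:
        output = infer(item)  # the main loop only does the forward pass
        pending.append(executor.submit(postprocess, output))
        # Cap the number of in-flight outputs so memory use stays bounded;
        # this blocks on the oldest job only when the workers fall behind.
        if len(pending) >= max_pending:
            results.append(pending.popleft().result())
    # Drain whatever is still running after the last input.
    while pending:
        results.append(pending.popleft().result())
    return results
```

It could be called as `with ThreadPoolExecutor(max_workers=4) as ex: run_pipeline(images, infer, postprocess, ex)`. If the post-processing is CPU-bound pure Python, a `ProcessPoolExecutor` is probably a better fit than threads; in that case `postprocess` has to be a picklable top-level function and the outputs have to be picklable as well. I don't know whether this is the recommended approach, hence the question.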