How to parallelize evaluation on a single GPU

Hi everyone. I have an nn.Module whose forward pass is written to take a single data point (sample) at a time rather than a batch, and it computes some statistics about that sample. The model's weights are already trained and I don't plan to update them, so I'll use it purely for evaluation.

It is non-trivial for me to vectorize the code further so it can evaluate multiple sentences as a batch. When I evaluate one sentence at a time, a good chunk of GPU memory and compute is left unused. What would be an effective way to parallelize so as to make the best use of the hardware?

One immediate idea is the torch.multiprocessing module, but that feels kind of hacky. Any other alternatives?
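For context, a lighter-weight alternative to multiprocessing is a thread pool: CUDA kernels release the GIL while they run, so several per-sample forward passes can be in flight at once. A minimal sketch of the pattern, where `evaluate_sample` and `evaluate_all` are hypothetical stand-ins (a real setup would call the trained model under `torch.no_grad()`, and could give each worker its own `torch.cuda.Stream`):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_sample(sample):
    # Stand-in for the per-sample model call. With a real nn.Module you
    # would do roughly:
    #     model.eval()
    #     with torch.no_grad():
    #         out = model(sample)
    # and optionally run each thread on its own torch.cuda.Stream so
    # kernels from different samples can overlap on the GPU.
    return sum(sample) / len(sample)

def evaluate_all(samples, workers=4):
    # Threads (not processes) suffice here because the GIL is released
    # while GPU kernels execute; map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evaluate_sample, samples))

if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(evaluate_all(data))  # [2.0, 5.0]
```

Whether this actually saturates the GPU depends on the model; if each forward pass is dominated by Python-side work rather than kernels, processes (via torch.multiprocessing with `model.share_memory()`) would still be the fallback.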

Refer to Split Single GPU


Thanks! Will check it out and post here if it helps.