PyTorch Forums
Split Single GPU
peterjc123
(Pu Jiachen)
August 23, 2019, 4:29am
17
What about starting three web servers, each on a different port, with one model per server?
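As a minimal sketch of that idea (the handler and model names here are hypothetical placeholders; in practice each handler would wrap one GPT-2 model's inference call), you can spin up one HTTP server per model, each on its own port:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PORTS = [8000, 8001, 8002]  # one port per model instance (assumed free)

def make_handler(model_name):
    # Build a request handler bound to a single model. The real version
    # would call that model's inference inside do_GET/do_POST.
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = f"response from {model_name}".encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # silence default request logging
            pass

    return Handler

def start_servers():
    # Start one daemon-threaded server per port and return the server objects
    # so the caller can shut them down later.
    servers = []
    for i, port in enumerate(PORTS):
        srv = HTTPServer(("127.0.0.1", port), make_handler(f"gpt2_{i}"))
        threading.Thread(target=srv.serve_forever, daemon=True).start()
        servers.append(srv)
    return servers
```

Each server then owns its model and handles requests independently, so the three models never contend for the Python GIL within a single process if you run each server in a separate process instead of a thread (e.g. via `multiprocessing` or three separate launches).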
Running inference for 3 GPT2 models concurrently is slower than sequentially. How to improve?