Multiple training jobs on a single GPU

Hi guys,

What is the best practice for running multiple training jobs on a single GPU? Do you run them in parallel or sequentially?

Based on my benchmark, when two training jobs run simultaneously, each one takes noticeably longer than it does when it has the GPU to itself. But when the jobs run sequentially, the GPU sits underutilized, which wastes resources.
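For context, here's roughly how I'm comparing the two modes. This is a minimal sketch: `train.py` and the `--config` flags are placeholders for my actual jobs, and both processes target the same GPU.

```python
import subprocess
import time

# Placeholder commands for the two training jobs; swap in your real scripts.
JOBS = [
    ["python", "train.py", "--config", "job_a.yaml"],
    ["python", "train.py", "--config", "job_b.yaml"],
]

def run_sequential() -> float:
    """Run the jobs one after another; return total wall-clock time."""
    start = time.perf_counter()
    for cmd in JOBS:
        subprocess.run(cmd, check=True)
    return time.perf_counter() - start

def run_concurrent() -> float:
    """Launch both jobs at once on the same GPU; wait for both to finish."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for cmd in JOBS]
    for p in procs:
        if p.wait() != 0:
            raise RuntimeError(f"job failed: {p.args}")
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"sequential: {run_sequential():.1f}s")
    print(f"concurrent: {run_concurrent():.1f}s")
```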

Thanks