Train several networks simultaneously on one GPU

I have only one GPU and have built two models. I want to train both models simultaneously on that GPU. Currently I am using a for loop to train them, and it is slow. Is there a faster way to do this? Thanks!
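
For concreteness, here is a minimal sketch of the kind of for loop meant above. The post does not name a framework, so PyTorch is an assumption, and the model sizes, the random data, and the names `make_model` and `train_step` are hypothetical placeholders rather than the actual setup.

```python
import torch
import torch.nn as nn

# Assumption: PyTorch. Falls back to the CPU when no GPU is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def make_model():
    # Hypothetical small network standing in for each of the two models.
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)


models = [make_model(), make_model()]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.01) for m in models]
loss_fn = nn.MSELoss()

# Random stand-in data; a real run would draw batches from a data loader.
x = torch.randn(64, 16, device=device)
y = torch.randn(64, 1, device=device)


def train_step():
    """Run one optimisation step for each model, one model after the other."""
    losses = []
    for model, opt in zip(models, optimizers):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        # .item() copies the value to the host, which waits for this model's
        # GPU work to finish before the loop moves on to the next model.
        losses.append(loss.item())
    return losses


history = [train_step() for _ in range(5)]
```

Each pass through the inner loop handles one model at a time, so the second model does not start its step until the first has finished.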