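Yes, with either model parallelism or data parallelism. In data parallelism each GPU holds a full copy of the model and processes a different slice of the batch; in model parallelism the model itself is split across GPUs. See also: Model parallelism in Multi-GPUs: forward/backward graph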
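
To make the difference concrete, here is a minimal sketch in plain Python. It is a toy simulation, not real multi-GPU code: the "devices" are ordinary function calls, and every name in it (`grad_mse`, `data_parallel_grad`, `model_parallel_step`) is made up for illustration. It assumes a scalar-weight model with squared-error loss and a batch that divides evenly across devices.

```python
# Toy simulation in plain Python. "Devices" are just separate computations,
# not real GPUs, and all function names are hypothetical.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def data_parallel_grad(w, xs, ys, n_devices):
    """Each device holds a full copy of w and sees one equal shard of the batch."""
    shard = len(xs) // n_devices  # assumption: the batch divides evenly
    local = [
        grad_mse(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(n_devices)
    ]
    # Averaging the per-device gradients plays the role of an all-reduce.
    return sum(local) / n_devices


def model_parallel_step(w1, w2, x, y):
    """Two-layer model y_hat = w2 * (w1 * x): layer 1 on device 0, layer 2 on device 1."""
    h = w1 * x                   # device 0: forward, then send activation h to device 1
    y_hat = w2 * h               # device 1: forward
    d_yhat = 2.0 * (y_hat - y)   # device 1: gradient of the squared error
    g_w2 = d_yhat * h            # device 1: gradient for its own weight
    d_h = d_yhat * w2            # device 1: send gradient w.r.t. h back to device 0
    g_w1 = d_h * x               # device 0: gradient for its own weight
    return g_w1, g_w2


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

full = grad_mse(0.5, xs, ys)                             # single-device gradient
sharded = data_parallel_grad(0.5, xs, ys, n_devices=2)   # same gradient from 2 shards
g_w1, g_w2 = model_parallel_step(w1=1.0, w2=0.5, x=2.0, y=3.0)
```

With equal-sized shards, the averaged data-parallel gradient matches the full-batch gradient. In the model-parallel case, the only things crossing the device boundary are the activation `h` on the forward pass and its gradient `d_h` on the backward pass.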