Distributed Training and Data Parallelism

How can we write multi-GPU distributed training code for a training/evaluation loop that only supports a single GPU?
The same approach should cover the training/evaluation loop and the loss computation; a sketch follows below.
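
A minimal sketch of one common answer, assuming PyTorch (the source does not name a framework): wrap the model in `DistributedDataParallel`, shard the data with `DistributedSampler`, and all-reduce the loss across processes for logging. The model, dataset, and hyperparameters here are hypothetical stand-ins for the single-GPU code being converted.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Hypothetical single-GPU model and dataset, moved to this rank's GPU.
    model = nn.Linear(10, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
    # DistributedSampler gives each rank a disjoint shard of the data.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # DDP all-reduces gradients during backward
            optimizer.step()

        # Average the last batch loss across all ranks for logging.
        loss_t = loss.detach()
        dist.all_reduce(loss_t, op=dist.ReduceOp.SUM)
        loss_t /= dist.get_world_size()
        if dist.get_rank() == 0:
            print(f"epoch {epoch}: mean loss {loss_t.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launch one process per GPU with, e.g., `torchrun --nproc_per_node=4 train.py`. The key change from the single-GPU version is that each process runs the same loop on its own data shard, while DDP keeps the model replicas in sync by averaging gradients.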