In the multi-GPU use case, I start one Triton server across two GPUs and place one instance of the TorchScript model on each GPU via the Triton config.
I am using Triton server. What other information is required?
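For reference, a `config.pbtxt` along these lines (a sketch, not my exact config) is what I mean by placing one instance on each GPU:

```protobuf
# config.pbtxt (sketch): one model instance pinned to each GPU
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  },
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 1 ]
  }
]
```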
- The Triton server is running on `cuda:0` and `cuda:1`
- The TorchScript model was compiled using `cuda:0`

The model instance on `cuda:1` of the server fails with the error message: `expected device cuda:1 but got device cuda:0`.
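My understanding (an assumption, not confirmed) is that `torch.jit.trace` records the devices of the example inputs, so a model traced with `cuda:0` tensors can hard-code `cuda:0` into its graph; `torch.jit.script` compiles the Python source instead and stays device-agnostic. A minimal sketch with a hypothetical `Double` module:

```python
import torch

class Double(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2

# torch.jit.trace bakes in the example input's device; a model traced on
# cuda:0 may then fail when Triton loads it on cuda:1.
# torch.jit.script compiles the source and does not pin a device.
scripted = torch.jit.script(Double())
out = scripted(torch.ones(2))  # runs on whatever device the input is on
```

If scripting is not an option, re-tracing the model once per target device (and shipping one artifact per GPU) should avoid the mismatch as well.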