[Libtorch] How to train using multiple gpu?

I used libtorch to create a model in a C++ environment and train it on a single GPU.

Now I want to train using multiple GPUs, but I don't know how.

Do you have any examples related to this?

I haven't used the C++ data parallel API yet, but you might want to take a look at this test.
I'm also unsure about the status of DDP in libtorch; DDP is the recommended approach for performance reasons.

Does torch::nn::parallel::data_parallel() support torch::jit::Module?

I think data_parallel should work with a scripted model, since it only chunks the inputs, transfers them to the specified GPUs, and copies the model to these devices. The requirement is that the eager model also runs fine in data parallel, i.e. no device mismatches are raised due to a hard-coded device being used inside the model.