I have a traced model which I use in C++.
The documentation states: “Automatic parallelization of models onto multiple GPUs like torch.nn.parallel.DataParallel”.
However, I do not see this happening on a multi-GPU machine. I currently load the module like this:

```cpp
auto module = torch::jit::load(s_model_name, torch::kCUDA);
```

How do I explicitly force the module to be parallelized across GPUs, the way `torch.nn.parallel.DataParallel` would do it in pure PyTorch?