I’ve been using libtorch to run my PyTorch model in a single thread.
The process is:
- module = torch::jit::load(model_path)
- out_tensor = module.forward(inputs).toTensor()
In our online environment, we use multiple threads to handle requests. In TensorFlow, the process is:
- model = load_graph() in the parent thread
- session = create_session(model) in each child thread; the child threads share the model’s weights
- run_session(session) in each child thread
How do I implement this structure in libtorch?
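To make the structure I’m after concrete, here is a compilable sketch of the parent-loads / children-share pattern using plain `std::thread`. The `Model` struct, `load_model`, `run_forward`, and `serve_requests` are all placeholder names I made up for illustration; in real libtorch code `Model` would be the `torch::jit::script::Module` returned by `torch::jit::load`, and `run_forward` would call `module.forward`.

```cpp
#include <memory>
#include <thread>
#include <vector>

// Placeholder for the loaded model. In libtorch this would be a
// torch::jit::script::Module returned by torch::jit::load("model.pt").
struct Model {
    int weight;  // stands in for the shared parameters
};

// Parent thread: load the model once (the load_graph() step in tf).
std::shared_ptr<const Model> load_model() {
    return std::make_shared<const Model>(Model{21});
}

// Child thread: run one request against the shared, read-only model
// (the run_session step in tf; here "forward" is just weight * input).
int run_forward(const std::shared_ptr<const Model>& model, int input) {
    return model->weight * input;
}

// Spawn one worker per request; all workers share the same weights.
std::vector<int> serve_requests(const std::vector<int>& inputs) {
    auto model = load_model();
    std::vector<int> outputs(inputs.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        workers.emplace_back([&, i] { outputs[i] = run_forward(model, inputs[i]); });
    }
    for (auto& w : workers) {
        w.join();
    }
    return outputs;
}
```

What I don’t know is whether a shared `torch::jit::script::Module` can safely take the place of `Model` here, i.e. whether concurrent `forward` calls on one module are allowed, or whether each child thread needs its own copy of the module.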
Please help me! Thanks.