I am trying to run a PyTorch model for inference in a multithreaded scenario where multiple threads predict using a shared model.
In the non-concurrent case, inference runs perfectly fine, but under concurrency I occasionally get the following error in different threads:
RuntimeError: The expanded size of the tensor (100) must match the existing size (3) at non-singleton dimension 1. Target sizes: [40, 100, 300]. Tensor sizes: [16, 3, 1]
I have experimented a little, and guarding the prediction call with a mutex solves the problem, but serialising inference defeats the purpose of having multiple threads.
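For reference, this is roughly the workaround I tried, reduced to a minimal sketch. `model_predict` is a hypothetical stand-in for the actual `model(input)` call (wrapped in `torch.no_grad()` in the real code); the point is only the lock around the shared model:

```python
import threading

# Hypothetical stand-in for the shared PyTorch model's forward pass;
# in the real code this is `model(input_tensor)` under `torch.no_grad()`.
def model_predict(x):
    return x * 2

predict_lock = threading.Lock()
results = []

def predict(x):
    # Serialising access to the shared model avoids the tensor-size error,
    # but it also removes the parallelism I was hoping to get.
    with predict_lock:
        results.append(model_predict(x))

threads = [threading.Thread(target=predict, args=(v,)) for v in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place every thread produces the correct result; without it, the error above appears intermittently.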
I have a couple of questions:
- What is the recommended way to have concurrent threads sharing the same model?
- Do I need to change any configuration-related settings?