We’re currently looking into deploying our first torchserve service. As a simple test, we loaded it with two workflows and did some load testing with ab. We’re seeing the torchserve process create lots and lots of threads (30k+, about 3 per request) and never clean them up. Eventually this leads to pthread errors and broken requests.
We’re super new to this, so any tips on what to look for and which configuration parameters could influence this would be much appreciated. I can share parts of the configuration, but please let me know which parameters matter most.
- torchserve 0.4.2 and 0.4.0 tested, same behavior
- using the REST interface for now for inference
- adjusted the netty thread count and the job queue size; even relatively small values (2 and 100) don’t seem to limit the number of threads spawned.
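
In case it helps anyone reproduce this, here is roughly how the thread count can be sampled while ab is running. It is only a minimal sketch: it assumes Linux (it reads the `Threads:` field from `/proc/<pid>/status`), and the function names and the sampling interval are made up for illustration, not part of torchserve.

```python
import sys
import time
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read the current thread count of a process (Linux only, assumes /proc)."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())


def watch(pid: int, interval: float = 1.0, samples: int = 10) -> None:
    """Print a timestamped thread count every `interval` seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), thread_count(pid))
        time.sleep(interval)


# Usage (hypothetical script name): python watch_threads.py <pid of the torchserve java process>
if __name__ == "__main__" and len(sys.argv) > 1:
    watch(int(sys.argv[1]))
```

If the count keeps climbing after the load test has finished, that would suggest threads are being leaked rather than just pooled under load.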