Torchserve spawning 30k+ threads but never closing them

Hey everyone,

We’re currently looking into deploying our first TorchServe service. As a simple test, we loaded it up with two workflows and did some load testing with ab. We’re seeing the TorchServe process create lots and lots of threads (we’re talking 30k+, about 3 per request) and never clean them up. Eventually this leads to pthread errors and broken requests.
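In case it helps anyone reproduce this, here is a minimal sketch of how the thread count can be watched while the load test runs. It assumes a Linux host, where each thread of a process shows up as one entry under /proc/<pid>/task. The helper names (`count_threads`, `watch_threads`) and the PID are made up for illustration and are not part of TorchServe.

```python
import os
import time


def count_threads(pid: int, proc_root: str = "/proc") -> int:
    """Return the number of threads of a process (Linux only).

    Each thread appears as one directory entry under <proc_root>/<pid>/task.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "task")))


def watch_threads(pid: int, interval: float = 1.0, samples: int = 10) -> list:
    """Sample the thread count `samples` times, `interval` seconds apart."""
    counts = []
    for _ in range(samples):
        counts.append(count_threads(pid))
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    # Hypothetical PID: replace with the PID of the TorchServe Java process.
    torchserve_pid = 12345
    for n in watch_threads(torchserve_pid, interval=1.0, samples=5):
        print(n)
```

Running this next to ab should show whether the count keeps climbing with each request or levels off once the load stops.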

We’re super new to this, so any tips on what to look for, which configuration parameters could influence this, and so on would be super appreciated. I can share parts of the configuration, but please let me know which parameters are of most importance.

Bonus info:

  • torchserve 0.4.2 and 0.4.0 tested, same behavior
  • using the REST interface for now for inference
  • adjusted netty threads as well as job queue size; even relatively small values (2 and 100) don’t seem to limit the number of threads spawned.

Thanks!
