Does PyTorch have any asynchronous inference API?

It does not out of the box.
However, JITed models release the GIL during execution, so launching the model in a background thread and waiting for completion is quite efficient.
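As a minimal sketch of that idea: the model below is a hypothetical tiny module (any scripted or traced module works the same way), and the forward call is dispatched to asyncio's default thread pool with `run_in_executor`, so the event loop stays responsive while TorchScript computes with the GIL released.

```python
import asyncio
import torch

# Hypothetical toy model standing in for a real network.
class Net(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x).sum(dim=1)

# TorchScript releases the GIL while forward runs.
model = torch.jit.script(Net())

async def infer(x):
    # Offload the blocking forward call to the default thread pool.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, model, x)

async def main():
    batch = torch.randn(4, 8)
    out = await infer(batch)
    print(out.shape)

asyncio.run(main())
```

With several concurrent `infer` calls in flight, the threads can overlap because the GIL is dropped inside the scripted forward.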

We do this in Chapter 15 of our book. While the free download on the PyTorch.org website has ended, the example code is available on GitHub: `request_batching_jit_server.py` in the deep-learning-with-pytorch/dlwpt-code repository.

Best regards

Thomas