TorchServe with WebSocket

Dear Community,

I’m trying to set up an image-segmentation proof of concept to deploy my neural network to the cloud using TorchServe.

What is the issue?

When connecting to the TorchServe server with a requests-based Python client, it appears that the only way to get a response is via the REST API, which returns a JSON body containing the data as a nested list.

This is my client code for a PUT request against a running TorchServe server:

import requests

headers = {'Content-type': 'image/png', 'Slug': 'your_file.png'}
with open(path, 'rb') as f:
    r = requests.put(url, data=f, headers=headers)

>>>[ [ 0.0,  -28.528526306152344,  ...long_list] ]
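To make the response shape concrete, here is a minimal standalone sketch of decoding such a body with the standard library; the sample string mimics the nested-list output above (with a real requests response you would call r.json() instead):

```python
import json

# Hypothetical response body shaped like the server output above:
# a JSON array of rows of float scores.
body = '[[0.0, -28.528526306152344, 3.25], [1.0, 2.0, -4.5]]'

scores = json.loads(body)          # nested Python lists of floats
rows, cols = len(scores), len(scores[0])
print(rows, cols, scores[0][1])
```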

The client receives the segmented image as JSON. Both the HTTP round trip and the JSON serialization add overhead. Is it possible to send the data bidirectionally, e.g. over a WebSocket connection?


Hi Dennis, not as of now, but could you elaborate a bit more on the benefits of a WebSocket approach? Generally I’d expect the bottleneck of a serving framework to be the model inference, not the JSON and HTTP serialization.
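One way to sanity-check that trade-off is to time JSON encoding against raw binary packing for a segmentation-sized float array. A standalone sketch (the 224x224 shape and the value pattern are made up for illustration, not taken from the original post):

```python
import json
import struct
import time

# Fake segmentation scores, roughly one 224x224 float map.
values = [float(i % 97) / 7.0 for i in range(224 * 224)]

# JSON text round trip.
t0 = time.perf_counter()
text = json.dumps(values)
decoded_json = json.loads(text)
t_json = time.perf_counter() - t0

# Raw binary round trip (8-byte doubles).
t0 = time.perf_counter()
blob = struct.pack(f'{len(values)}d', *values)
decoded_bin = list(struct.unpack(f'{len(blob) // 8}d', blob))
t_bin = time.perf_counter() - t0

print(f'JSON:   {len(text):>8} bytes, {t_json * 1e3:.2f} ms')
print(f'binary: {len(blob):>8} bytes, {t_bin * 1e3:.2f} ms')
```

On typical hardware both round trips finish in a few milliseconds, which is why the serialization cost usually matters only when it is comparable to the model's inference latency.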