The ping API returns Healthy most of the time; I assume that after some time it hits a timeout and starts reporting Unhealthy.
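For reference, this is how I'm checking health (assuming the default inference port 8080):

```
curl http://localhost:8080/ping
# while healthy this returns: {"status": "Healthy"}
```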
I do see these errors popping up in the logs, and I am not sure where they are coming from:
W-9009-my_tc_1.0-stdout MODEL_LOG - Backend worker process died.
my_tc_1.0-stdout MODEL_LOG - Traceback (most recent call last):
my_tc_1.0-stdout MODEL_LOG - File ".../site-packages/ts/model_service_worker.py", line 250, in <module>
my_tc_1.0-stdout MODEL_LOG - worker = TorchModelServiceWorker(
my_tc_1.0-stdout MODEL_LOG - File ".../site-packages/ts/model_service_worker.py", line 69, in __init__
my_tc_1.0-stdout MODEL_LOG - self.port = str(port_num) + LOCAL_RANK
my_tc_1.0-stdout MODEL_LOG - TypeError: can only concatenate str (not "int") to str
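If I'm reading the traceback right, this is just a str/int concatenation. A minimal reproduction outside of TorchServe (values made up):

```python
# mimics self.port = str(port_num) + LOCAL_RANK from the traceback
port_num = 9000
LOCAL_RANK = 0  # evidently an int at this point
port = str(port_num) + LOCAL_RANK
# TypeError: can only concatenate str (not "int") to str
```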
How can I provide the Context file via the command line?
I suppose my questions are:
When I do curl http://localhost:8080/predictions/my_tc/1.0 -T input_file.txt, what method is called? Which method in the Custom Handler or Base Handler receives the input file?
What should the input file look like? I think JSON was mentioned somewhere. (What I am currently sending is shown below.)
Are there any examples that create the Context file and then pass it in for serving?
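For reference, this is roughly what I'm sending right now; I'm not sure it's the expected shape, and the "data" field name is just my guess:

```
# contents of input_file.txt, sent with the curl -T command above
{"data": "some example input text"}
```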
So in this case the issue is not your fault, but if you're trying to debug a model where the issue is probably yours, the most useful log to inspect is logs/model_log.log.
@marksaroufim - Thanks for the reply. Any ideas on these questions:
When I do curl http://localhost:8080/predictions/my_tc/1.0 -T input_file.txt, what method is called?
Which method in the Custom Handler or Base Handler receives the input file? (A simplified version of my handler is pasted below.)
What should the input file look like? I think JSON was mentioned somewhere.
Are there any examples that create the Context file and then pass it in for serving?
If I don’t see anything showing up in that log, what could be the cause?
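For context, here is the simplified version of my handler mentioned above. It subclasses BaseHandler; the class name and the placeholder "model" are mine, the rest follows the preprocess/inference/postprocess hooks that BaseHandler.handle() drives:

```python
from ts.torch_handler.base_handler import BaseHandler

class EnglishContextHandler(BaseHandler):
    # handle(data, context) on BaseHandler is the entry point TorchServe
    # calls; it runs preprocess -> inference -> postprocess on each batch.

    def preprocess(self, data):
        # data is a list with one entry per request in the batch; the raw
        # bytes of input_file.txt show up under "data" or "body"
        inputs = []
        for row in data:
            payload = row.get("data") or row.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            inputs.append(payload)
        return inputs

    def inference(self, data, *args, **kwargs):
        # placeholder instead of a real model call
        return [str(x).upper() for x in data]

    def postprocess(self, data):
        # must be a list with one entry per request in the batch
        return data
```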
If I do a curl request while running “tail -f logs/*.log”, I only see the following for my custom model. For the mnist example, I see entries coming in for all the expected logs.
It suddenly started to work. I will try to figure out what I did to get it to work.
My custom handler is now called. The only issue is that I am returning a Python list, but I always get only the first element back. I am using JSONEnvelope, so I am now looking into what is causing this. I assume it is related to the batching feature.
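My current guess at the fix, based on batching mapping one response-list element to each request in the batch, is to wrap the predictions so a single request gets the whole list back:

```python
# replaces postprocess in the handler sketch above
def postprocess(self, data):
    # data is the full list of predictions for ONE request; returning it
    # as-is makes each element look like the reply to a different request
    # in the batch, so only the first element comes back
    return [data]  # one entry per request in the batch
```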
OK, I got this to work … but now I am getting “Model "EnglishContext" has no worker to serve inference request. Please use scale workers API to add workers.”, and I have no clue why this suddenly started happening.
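I also tried the scale workers API that the error message points to (assuming the default management port 8081):

```
curl -X PUT "http://localhost:8081/models/EnglishContext?min_worker=1"
```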
I have now added “default_workers_per_model=1”, but the error persists.
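i.e. the relevant line in my config.properties is now:

```
default_workers_per_model=1
```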
Alright, things are working now. I think the root cause was model-dependency related (and the server running out of disk space). However, I do not have a reproducible test case.
@marksaroufim does it still make sense to open a ticket?