How is this possible?

Can someone please share how to debug this?

  • created a mar file, started torch, provided configs, etc.
  • for what it’s worth, I am also seeing: Model config: N/A in logs/ts_log.log
WARN W-9000-m_1.0-stderr MODEL_LOG - Traceback (most recent call last):
WARN W-9000-m_1.0-stderr MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_service_worker.py", line 16, in <module>
WARN W-9000-m_1.0-stderr MODEL_LOG -     from ts.model_loader import ModelLoaderFactory
WARN W-9000-m_1.0-stderr MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_loader.py", line 13, in <module>
WARN W-9000-m_1.0-stderr MODEL_LOG -     from ts.service import Service
WARN W-9000-m_1.0-stderr MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/service.py", line 11, in <module>
WARN W-9000-m_1.0-stderr MODEL_LOG -     from ts.protocol.otf_message_handler import create_predict_response
WARN W-9000-m_1.0-stderr MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/protocol/otf_message_handler.py", line 13, in <module>
WARN W-9000-m_1.0-stderr MODEL_LOG -     import torch
WARN W-9000-m_1.0-stderr MODEL_LOG - ModuleNotFoundError: No module named 'torch'

Could you please share your config.properties?

Are you using the interpreter you think you’re using? It looks like you’re using python 3.11 installed via homebrew, and based on your error I would assume that’s not where you installed torch.

You can run pip freeze | grep torch to confirm.
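A quick way to check which interpreter is actually being used, and whether torch is visible from it (a sketch; paths will differ on your machine):

```shell
# Which python3 is first on PATH? TorchServe spawns its workers with the
# interpreter it resolves, so this is the environment that needs torch/numpy.
which python3
python3 -c "import sys; print(sys.executable)"

# Does that interpreter see torch? This prints the install location, or fails
# with the same ModuleNotFoundError the worker hit.
python3 -c "import torch; print(torch.__file__)"

# Where is torchserve itself installed? Its first line (shebang) shows which
# interpreter the CLI was installed under.
which torchserve
```

If the torchserve shebang points at the Homebrew Python but you installed torch into a different environment (e.g. a conda env or your PyCharm project venv), the worker will not see it.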

I updated torch through pip, and I no longer see that error.

However, now it is throwing the same kind of error for numpy and ‘ts.torch_handler.custom_handler’.

Here is my config.properties, nothing out of the ordinary:

# The address for inference API calls
inference_address=http://0.0.0.0:8080

# The address for management API calls
management_address=http://0.0.0.0:8081

# The address for metrics API calls
metrics_address=http://0.0.0.0:8082

# Number of worker threads for each model
default_workers_per_model=1

# Path to the directory containing .mar files
model_store=./model_store/

And here is the config.json which I pass in for the model:

{
  "model": {
    "modelName": "model_v1",
    "serializedFile": "model_v1.mar",
    "handler": "my_handler",
    "modelFile": "my_net.py",
    "modelVersion": "1.0"
  },
  "runtime": "python3",
  "minWorkers": 1,
  "maxWorkers": 5,
  "batchSize": 1,
  "maxBatchDelay": 100,
  "responseTimeout": 120
}

Then I am also seeing in the logs:

2023-06-16T16:59:00,986 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model_v1 loaded.
2023-06-16T16:59:00,986 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: model_v1, count: 1

[INFO ] -stdout MODEL_LOG - Torch worker started.
[INFO ] -stdout MODEL_LOG - Python runtime: 3.11.3
[DEBUG]  org.pytorch.serve.wlm.WorkerThread -  State change null -> WORKER_STARTED
[INFO ]  org.pytorch.serve.wlm.WorkerThread - Connecting to: /127.0.0.1:9000
[INFO ] -stdout MODEL_LOG - Connection accepted: ('127.0.0.1', 9000).
[INFO ]  org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD to backend at: 1686949148239
[INFO ] -stdout MODEL_LOG - model_name: model_v1, batchSize: 1
[INFO ] -stdout MODEL_LOG - Backend worker process died.
[INFO ] -stdout MODEL_LOG - Traceback (most recent call last):
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_loader.py", line 100, in load
[INFO ] -stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
[INFO ] -stdout MODEL_LOG -                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_loader.py", line 145, in _load_handler_file
[INFO ] -stdout MODEL_LOG -     module = importlib.import_module(module_name)
[INFO ] -stdout MODEL_LOG -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
[INFO ] -stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
[INFO ] -stdout MODEL_LOG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap_external>", line 940, in exec_module
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[INFO ] -stdout MODEL_LOG -   File "/private/var/folders/99/75hxn5zs4vd_fbmfsj5nky2c0000gp/T/models/a1707d56fc184f26a4a9ed4619fe0863/custom_handler.py", line 12, in <module>
[INFO ] -stdout MODEL_LOG -     import numpy as np
[INFO ] -stdout MODEL_LOG - ModuleNotFoundError: No module named 'numpy'
[INFO ] nioEventLoopGroup-5-1 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
[INFO ] -stdout MODEL_LOG - 
[INFO ] -stdout MODEL_LOG - During handling of the above exception, another exception occurred:
[INFO ] -stdout MODEL_LOG - 
[INFO ] -stdout MODEL_LOG - Traceback (most recent call last):
[DEBUG]  org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_service_worker.py", line 253, in <module>
[INFO ] -stdout MODEL_LOG -     worker.run_server()
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_service_worker.py", line 221, in run_server
[INFO ] -stdout MODEL_LOG -     self.handle_connection(cl_socket)
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_service_worker.py", line 184, in handle_connection
[INFO ] -stdout MODEL_LOG -     service, result, code = self.load_model(msg)
[INFO ] -stdout MODEL_LOG -                             ^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_service_worker.py", line 131, in load_model
[INFO ] -stdout MODEL_LOG -     service = model_loader.load(
[INFO ] -stdout MODEL_LOG -               ^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_loader.py", line 102, in load
[INFO ] -stdout MODEL_LOG -     module = self._load_default_handler(handler)
[INFO ] -stdout MODEL_LOG -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/lib/python3.11/site-packages/ts/model_loader.py", line 151, in _load_default_handler
[INFO ] -stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
[INFO ] -stdout MODEL_LOG -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
[INFO ] -stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
[INFO ] -stdout MODEL_LOG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1128, in _find_and_load_unlocked
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[INFO ] -stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1142, in _find_and_load_unlocked
[INFO ] -stdout MODEL_LOG - ModuleNotFoundError: No module named 'ts.torch_handler.custom_handler'
[DEBUG]  org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException: null
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1765) ~[?:?]
  at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
  at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:213) ~[model-server.jar:?]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577) ~[?:?]
  at java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
  at java.lang.Thread.run(Thread.java:1623) [?:?]
[WARN ]  org.pytorch.serve.wlm.BatchAggregator - Load model failed: model_v1, error: Worker died.

It seems that when the worker goes through the custom handler’s imports, it cannot find numpy, and the same happens for from ts.torch_handler.base_handler import BaseHandler.

Any ideas?

@ringohoffman How would I resolve this? I am able to import those packages from PyCharm, but the Server seems to not get them.
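This is the same interpreter-mismatch issue as before: the worker runs under the Homebrew Python (/opt/homebrew/lib/python3.11/...), not whatever environment PyCharm uses, so installing numpy into that interpreter (e.g. /opt/homebrew/bin/python3 -m pip install numpy) should clear it. Note from your own traceback that TorchServe first tries to import your handler file and only falls back to a built-in ts.torch_handler.* name when that import raises; here the file import died on numpy, which is why you see both errors. An alternative, if your handler has third-party dependencies, is to bundle a requirements.txt into the .mar via torch-model-archiver’s --requirements-file flag and let TorchServe install it per model. That only takes effect with this setting enabled (a sketch to add to your config.properties; adapt to your setup):

```properties
# config.properties (addition) — let TorchServe pip-install each model's
# bundled requirements.txt into the worker environment at load time
install_py_dep_per_model=true
```

Either way, the fix is making sure the packages exist in the interpreter the worker actually spawns, not the one your IDE points at.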