RuntimeError: Unrecognized CachingAllocator option: 0

Hi,
I ran my program on Ubuntu 22; NVIDIA information is as follows:
NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4
torch version: 2.2.1+cu121

File "/home/ubuntu/ChatGLM3/openai_api_demo/openai_api.py", line 476, in
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).to(DEVICE).eval()
File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2556, in to
return super().to(*args, **kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1152, in to
return self._apply(convert)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 825, in _apply
param_applied = fn(param)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1150, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: Unrecognized CachingAllocator option: 0

Could you post a minimal and executable code snippet reproducing the issue, please?
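One thing worth checking before building a full repro: the error is raised while the CUDA caching allocator parses its configuration during `torch._C._cuda_init()`, so a stray or malformed `PYTORCH_CUDA_ALLOC_CONF` value (e.g. set to `0`) in the environment could explain it. This is a guess from the message text, not a confirmed diagnosis. A minimal sketch to check, runnable on any machine:

```python
import os

# Guess: PyTorch parses PYTORCH_CUDA_ALLOC_CONF at CUDA initialization,
# and an unparseable value such as "0" can trigger
# "RuntimeError: Unrecognized CachingAllocator option: 0".
conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF")
if conf is not None:
    print(f"PYTORCH_CUDA_ALLOC_CONF is set to {conf!r}; "
          "try unsetting it and rerunning your script.")
else:
    print("PYTORCH_CUDA_ALLOC_CONF is not set; the cause is elsewhere.")
```

If the variable is set, `unset PYTORCH_CUDA_ALLOC_CONF` (or remove it from your shell profile) before rerunning the server may be enough to get past the init failure.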