Yadir Hernandez Batista
f1d838251e

KeyError: 'cache_id' originating from the function recv_embeddings
Ran into issues today while testing the new Qwen3-VL-Instruct_5.0bpw_H6:
```
2025-11-28 19:03:58.428 INFO: Received chat completion streaming request e693c1eef51641df8d64dee63d490091
2025-11-28 19:03:58.823 ERROR: FATAL ERROR with generation. Attempting to recreate the generator. If this fails, please restart the server.
2025-11-28 19:03:58.825 WARNING: Immediately terminating all jobs. Clients will have their requests cancelled.
2025-11-28 19:03:58.830 ERROR: Traceback (most recent call last):
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/endpoints/OAI/utils/chat_completion.py", line 373, in stream_generate_chat_completion
2025-11-28 19:03:58.830 ERROR:     raise generation
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/endpoints/OAI/utils/completion.py", line 118, in _stream_collector
2025-11-28 19:03:58.830 ERROR:     async for generation in new_generation:
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 779, in stream_generate
2025-11-28 19:03:58.830 ERROR:     async for generation_chunk in self.generate_gen(
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1060, in generate_gen
2025-11-28 19:03:58.830 ERROR:     raise ex
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1002, in generate_gen
2025-11-28 19:03:58.830 ERROR:     async for result in job:
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 87, in __aiter__
2025-11-28 19:03:58.830 ERROR:     raise result
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 23, in _run_iteration
2025-11-28 19:03:58.830 ERROR:     results = self.generator.iterate()
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-28 19:03:58.830 ERROR:     return func(*args, **kwargs)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/generator.py", line 298, in iterate
2025-11-28 19:03:58.830 ERROR:     job.prefill(results)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/job.py", line 1001, in prefill
2025-11-28 19:03:58.830 ERROR:     self.generator.model.prefill(
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-28 19:03:58.830 ERROR:     return func(*args, **kwargs)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model.py", line 101, in prefill
2025-11-28 19:03:58.830 ERROR:     return self.prefill_tp(x, params, self.last_kv_module_idx, self.modules)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 379, in prefill_tp
2025-11-28 19:03:58.830 ERROR:     self.tp_worker_dispatch(device, mp_model_forward, (
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 171, in tp_worker_dispatch
2025-11-28 19:03:58.830 ERROR:     conn.send((fn, args))
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_fn.py", line 294, in send
2025-11-28 19:03:58.830 ERROR:     self.result = fn(self.local_context, *args)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_fn.py", line 202, in mp_model_forward
2025-11-28 19:03:58.830 ERROR:     params["indexed_embeddings"] = recv_embeddings(consumer, p)
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/tokenizer/mm_embedding.py", line 122, in recv_embeddings
2025-11-28 19:03:58.830 ERROR:     consumer.recv(dse) for dse in imp["deepstack_embeddings"]
2025-11-28 19:03:58.830 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_shared.py", line 189, in recv
2025-11-28 19:03:58.830 ERROR:     cache_id = imp["cache_id"]
2025-11-28 19:03:58.830 ERROR: KeyError: 'cache_id'
2025-11-28 19:03:58.846 ERROR: Sent to request: Chat completion aborted. Please check the server console.
2025-11-28 19:03:58.890 INFO: 10.0.30.254:61377 - "GET /health HTTP/1.0" 503
2025-11-28 19:03:58.893 INFO: Received chat completion request ebf67afe70b74f8f8b34a28746951794
2025-11-28 19:04:00.577 ERROR: FATAL ERROR with generation. Attempting to recreate the generator. If this fails, please restart the server.
2025-11-28 19:04:00.578 WARNING: Immediately terminating all jobs. Clients will have their requests cancelled.
2025-11-28 19:04:00.580 INFO: 10.0.30.254:11907 - "GET /health HTTP/1.0" 503
2025-11-28 19:04:00.582 ERROR: Traceback (most recent call last):
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/endpoints/OAI/utils/chat_completion.py", line 437, in generate_chat_completion
2025-11-28 19:04:00.582 ERROR:     generations = await asyncio.gather(*gen_tasks)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 692, in generate
2025-11-28 19:04:00.582 ERROR:     async for generation in self.stream_generate(
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 779, in stream_generate
2025-11-28 19:04:00.582 ERROR:     async for generation_chunk in self.generate_gen(
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1060, in generate_gen
2025-11-28 19:04:00.582 ERROR:     raise ex
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1002, in generate_gen
2025-11-28 19:04:00.582 ERROR:     async for result in job:
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 87, in __aiter__
2025-11-28 19:04:00.582 ERROR:     raise result
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 23, in _run_iteration
2025-11-28 19:04:00.582 ERROR:     results = self.generator.iterate()
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-28 19:04:00.582 ERROR:     return func(*args, **kwargs)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/generator.py", line 298, in iterate
2025-11-28 19:04:00.582 ERROR:     job.prefill(results)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/job.py", line 1001, in prefill
2025-11-28 19:04:00.582 ERROR:     self.generator.model.prefill(
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-11-28 19:04:00.582 ERROR:     return func(*args, **kwargs)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model.py", line 101, in prefill
2025-11-28 19:04:00.582 ERROR:     return self.prefill_tp(x, params, self.last_kv_module_idx, self.modules)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 386, in prefill_tp
2025-11-28 19:04:00.582 ERROR:     r = self.tp_worker_result(device)
2025-11-28 19:04:00.582 ERROR:   File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 181, in tp_worker_result
2025-11-28 19:04:00.582 ERROR:     raise result
2025-11-28 19:04:00.582 ERROR: KeyError: 'cache_id'
2025-11-28 19:04:00.590 ERROR: Sent to request: Chat completion ebf67afe70b74f8f8b34a28746951794 aborted. Maybe the model was unloaded? Please check the server console.
2025-11-28 19:04:00.591 INFO: 10.0.30.254:50633 - "POST /v1/chat/completions HTTP/1.1" 503
2025-11-28 19:04:01.592 INFO: 10.0.30.254:35928 - "GET /health HTTP/1.0" 503
```
After reviewing with Copilot (I don't know this code base), I found a small but valid change that got tabbyAPI working again without breaking the server.
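For anyone hitting the same traceback, the failure mode boils down to a bare dictionary lookup on a key that isn't always present: `recv` in `model_tp_shared.py` does `cache_id = imp["cache_id"]`, and the embedding descriptor it receives for this model has no `cache_id` entry. The sketch below is a hypothetical illustration of that pattern and one defensive way around it (the function names and fallback behavior here are mine, not the actual patch that was applied):

```python
# Hypothetical sketch of the KeyError seen above, not exllamav3's real code.
# "imp" stands in for the shared-embedding descriptor dict passed to recv().

def recv_strict(imp: dict):
    # Mirrors the failing line: a bare lookup raises KeyError when the
    # producer never attached a cache_id to the descriptor.
    cache_id = imp["cache_id"]
    return cache_id

def recv_defensive(imp: dict):
    # A small, low-risk change: treat cache_id as optional and fall back
    # to an uncached receive path when it is absent.
    cache_id = imp.get("cache_id")
    if cache_id is None:
        return None  # caller skips the cache lookup
    return cache_id

try:
    recv_strict({})                  # reproduces: KeyError: 'cache_id'
except KeyError as e:
    print(f"KeyError: {e}")          # prints: KeyError: 'cache_id'

print(recv_defensive({}))            # prints: None
print(recv_defensive({"cache_id": "abc"}))  # prints: abc
```

Whether `.get()` with a fallback is the right fix, or the producer side should be populating `cache_id` in the first place, is a question for the exllamav3 maintainers; the change above only shows why the server stops crashing.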
2025-11-28 15:14:58 -05:00