Commit Graph

722 Commits

Author SHA1 Message Date
turboderp
1fa6071bc3 Config: Recognize rope_parameters and rope_theta therein 2025-12-03 18:25:03 +01:00
turboderp
beb23d4095 Loader: Don't break if transposing 0D tensor 2025-12-03 18:24:13 +01:00
turboderp
ba657d399d chat.py: Add Ministral template 2025-12-03 18:23:34 +01:00
turboderp
88d3814bc5 Merge pull request #113 from yadirhb/patch-1
KeyError: 'cache_id' originating from the function recv_embeddings
2025-11-29 00:05:47 +01:00
Yadir Hernandez Batista
965f70a871 Same treatment to method 2025-11-28 15:26:26 -05:00
Yadir Hernandez Batista
c8f30494f9 Removed None since it's the default. 2025-11-28 15:20:16 -05:00
Yadir Hernandez Batista
f1d838251e KeyError: 'cache_id' originating from the function recv_embeddings
Ran into issues today while testing the new Qwen3-VL-Instruct_5.0bpw_H6:
```
2025-11-28 19:03:58.428 INFO: Received chat completion streaming request e693c1eef51641df8d64dee63d490091
2025-11-28 19:03:58.823 ERROR: FATAL ERROR with generation. Attempting to recreate the generator. If this fails, please restart the server.
2025-11-28 19:03:58.825 WARNING: Immediately terminating all jobs. Clients will have their requests cancelled.
2025-11-28 19:03:58.830 ERROR: Traceback (most recent call last):
  File "/opt/tabbyAPI/endpoints/OAI/utils/chat_completion.py", line 373, in stream_generate_chat_completion
    raise generation
  File "/opt/tabbyAPI/endpoints/OAI/utils/completion.py", line 118, in _stream_collector
    async for generation in new_generation:
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 779, in stream_generate
    async for generation_chunk in self.generate_gen(
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1060, in generate_gen
    raise ex
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1002, in generate_gen
    async for result in job:
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 87, in __aiter__
    raise result
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 23, in _run_iteration
    results = self.generator.iterate()
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/generator.py", line 298, in iterate
    job.prefill(results)
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/job.py", line 1001, in prefill
    self.generator.model.prefill(
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model.py", line 101, in prefill
    return self.prefill_tp(x, params, self.last_kv_module_idx, self.modules)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 379, in prefill_tp
    self.tp_worker_dispatch(device, mp_model_forward, (
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 171, in tp_worker_dispatch
    conn.send((fn, args))
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_fn.py", line 294, in send
    self.result = fn(self.local_context, *args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_fn.py", line 202, in mp_model_forward
    params["indexed_embeddings"] = recv_embeddings(consumer, p)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/tokenizer/mm_embedding.py", line 122, in recv_embeddings
    consumer.recv(dse) for dse in imp["deepstack_embeddings"]
                                  ^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp_shared.py", line 189, in recv
    cache_id = imp["cache_id"]
               ~~~^^^^^^^^^^^^
KeyError: 'cache_id'
2025-11-28 19:03:58.846 ERROR: Sent to request: Chat completion aborted. Please check the server console.
2025-11-28 19:03:58.890 INFO: 10.0.30.254:61377 - "GET /health HTTP/1.0" 503
2025-11-28 19:03:58.893 INFO: Received chat completion request ebf67afe70b74f8f8b34a28746951794
2025-11-28 19:04:00.577 ERROR: FATAL ERROR with generation. Attempting to recreate the generator. If this fails, please restart the server.
2025-11-28 19:04:00.578 WARNING: Immediately terminating all jobs. Clients will have their requests cancelled.
2025-11-28 19:04:00.580 INFO: 10.0.30.254:11907 - "GET /health HTTP/1.0" 503
2025-11-28 19:04:00.582 ERROR: Traceback (most recent call last):
  File "/opt/tabbyAPI/endpoints/OAI/utils/chat_completion.py", line 437, in generate_chat_completion
    generations = await asyncio.gather(*gen_tasks)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 692, in generate
    async for generation in self.stream_generate(
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 779, in stream_generate
    async for generation_chunk in self.generate_gen(
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1060, in generate_gen
    raise ex
  File "/opt/tabbyAPI/backends/exllamav3/model.py", line 1002, in generate_gen
    async for result in job:
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 87, in __aiter__
    raise result
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/async_generator.py", line 23, in _run_iteration
    results = self.generator.iterate()
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/generator.py", line 298, in iterate
    job.prefill(results)
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/generator/job.py", line 1001, in prefill
    self.generator.model.prefill(
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model.py", line 101, in prefill
    return self.prefill_tp(x, params, self.last_kv_module_idx, self.modules)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 386, in prefill_tp
    r = self.tp_worker_result(device)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tabbyAPI/venv/lib/python3.12/site-packages/exllamav3/model/model_tp.py", line 181, in tp_worker_result
    raise result
KeyError: 'cache_id'
2025-11-28 19:04:00.590 ERROR: Sent to request: Chat completion ebf67afe70b74f8f8b34a28746951794 aborted. Maybe the model was unloaded? Please check the server console.
2025-11-28 19:04:00.591 INFO: 10.0.30.254:50633 - "POST /v1/chat/completions HTTP/1.1" 503
2025-11-28 19:04:01.592 INFO: 10.0.30.254:35928 - "GET /health HTTP/1.0" 503
```

After reviewing with Copilot (I don't know this code base), I found a small but valid change that got tabbyAPI working again without breaking the server.
2025-11-28 15:14:58 -05:00
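The failure mode in the traceback above — a tensor-parallel consumer indexing a metadata dict that may not contain a `"cache_id"` entry — can be sketched as follows. This is a hypothetical simplification, not the actual exllamav3 code: `recv_strict` and `recv_tolerant` are illustrative names, and the real fix in `model_tp_shared.py` may differ in detail.

```python
# Hypothetical sketch of the bug and the defensive-lookup fix.
# The dict key "cache_id" mirrors the traceback; the function names
# and fallback behavior are assumptions for illustration only.

def recv_strict(imp: dict):
    # Original pattern: raises KeyError when the producer side
    # never attached a "cache_id" to the embedding metadata.
    return imp["cache_id"]

def recv_tolerant(imp: dict):
    # Fixed pattern: treat a missing "cache_id" as "nothing cached"
    # instead of crashing the whole generator.
    cache_id = imp.get("cache_id")  # None when the key is absent
    if cache_id is None:
        return None                 # caller falls back to a fresh receive
    return cache_id

# Reproduce the reported crash with an empty metadata dict:
try:
    recv_strict({})
except KeyError as e:
    print(f"KeyError: {e}")        # same error as in the server log

print(recv_tolerant({}))           # degrades gracefully instead
```

Note that `dict.get("cache_id")` already returns `None` for a missing key, which is presumably why the follow-up commit c8f30494f9 ("Removed None since it's the default") dropped the explicit default argument.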
turboderp
9e314e6c76 Bump to v0.0.16 v0.0.16 2025-11-25 17:54:31 +01:00
turboderp
85ae1e45b5 Add boilerplate for sharing MM/deepstack embeddings across TP model 2025-11-24 22:23:35 +01:00
turboderp
c9654130a5 Fix TP regression 2025-11-24 00:21:09 +01:00
turboderp
232cc1d8ea ParallelDecoderBlock: Fix regression 2025-11-16 14:27:56 +01:00
turboderp
ef8fd43d1c Cleanup unused imports 2025-11-16 14:25:46 +01:00
turboderp
2fc131efc1 Bump to v0.0.15 v0.0.15 2025-11-16 13:51:37 +01:00
turboderp
9b58b45999 Update README.md 2025-11-13 17:23:41 +01:00
turboderp
c47c17bae2 Add Glm4V-MoE architecture 2025-11-13 16:56:29 +01:00
turboderp
78c80485e6 Qwen3VL: Fix type hints 2025-11-13 16:55:11 +01:00
turboderp
1084881f98 Config: Consider token IDs in text_config section 2025-11-13 16:54:32 +01:00
Mamy Ratsimbazafy
ad1aab406f Add size estimation script for model tensors size 2025-11-13 13:49:39 +01:00
turboderp
08a82f36a3 Add Glm4V architecture 2025-11-13 13:00:19 +01:00
turboderp
7c6b6c473f chat.py: Allow breaking stream with esc 2025-11-13 12:55:32 +01:00
turboderp
0bd0bfa17d chat.py: Token dump feature 2025-11-13 12:54:34 +01:00
turboderp
c3886f0841 chat.py: Don't crash on wrong stop conditions 2025-11-13 12:53:23 +01:00
turboderp
1dff8123a6 chat.py: Better SVG extract 2025-11-13 12:52:46 +01:00
turboderp
792bd9dc75 Glm4V: Update examples 2025-11-13 12:52:46 +01:00
turboderp
e3d83c5670 (WIP) Vision tower TP split 2025-11-13 12:48:25 +01:00
turboderp
b937401715 Tokenizer: Fix encoding of unspecial added tokens 2025-11-13 12:46:53 +01:00
turboderp
605428a9cf RoPE kernel: Allow inv_freq (and M-RoPE) table to work with partial_rotary_factor 2025-11-13 12:46:02 +01:00
turboderp
2a4aac07d8 Bump to v0.0.14 v0.0.14 2025-11-10 01:35:08 +01:00
turboderp
fcce0e6985 Fix example Mistral template 2025-11-10 01:34:26 +01:00
turboderp
abd8ddccbf Conv: Fix regression for 2D input 2025-11-10 01:32:49 +01:00
turboderp
98e1c4017c Update README.md v0.0.13 2025-11-09 23:02:08 +01:00
turboderp
46c616a5f9 Bump to v0.0.13 2025-11-09 22:59:29 +01:00
turboderp
38ddd8b9c5 MMLU: Fix prompt 2025-11-09 22:25:53 +01:00
turboderp
1a298a8161 Generator: Fix job enqueue when max_new_tokens == 1 2025-11-09 21:41:42 +01:00
turboderp
03f3d9b861 Unpack SimpleNamespace dict for older Python versions 2025-11-09 15:46:58 +01:00
turboderp
8190932910 Update multimodal example 2025-11-09 13:34:11 +01:00
turboderp
3562dbe7b0 compare_q.py: Work around AutoAWQ being broken in later versions of Transformers 2025-11-09 13:33:40 +01:00
turboderp
ff89563719 Qwen3VL: Fix typo 2025-11-09 03:09:32 +01:00
turboderp
1739b92d50 DeepstackEmbed: Fix typo 2025-11-09 02:53:06 +01:00
turboderp
5d64b322f9 Add Qwen3VLMoe 2025-11-09 02:50:08 +01:00
turboderp
6bc327afda Allow loading fused expert tensors (new Qwen3 MoE format) 2025-11-09 02:49:18 +01:00
turboderp
20cbe13cab Update multimodal example 2025-11-09 01:42:30 +01:00
turboderp
da3ea0013c Add Qwen3VL 2025-11-09 01:41:43 +01:00
turboderp
663f3694d8 Linear: Account for bias when splitting fused tensor 2025-11-09 01:41:43 +01:00
turboderp
261621ce0a GatedDeltaNet: Add module name 2025-11-09 01:41:43 +01:00
turboderp
7d9ad806e6 RoPE: Add M-RoPE facilities 2025-11-09 01:41:43 +01:00
turboderp
fab5da05ce chat.py: Better SVG extraction 2025-11-02 23:54:22 +01:00
turboderp
3533b307c3 Update README.md 2025-11-01 19:42:18 +01:00
turboderp
0c0139d9cf Update docs 2025-11-01 19:40:58 +01:00
turboderp
e384d39e98 Bump to v0.0.12 v0.0.12 2025-11-01 18:21:50 +01:00