🗣️ #543 - dots.llm1 support and thanks
| Author | Iconology |
|---|---|
| Created | 2025-06-20 |
| Updated | 2025-07-03 |
Description
Hey, friend,
Out of curiosity, do you have any plans to add dots.llm1 support? The model seems interesting enough. I tried it out on mainline, but the speeds were atrocious for its size, making it unusable, at least for me. That's why I jumped over to your fork (thanks to ubergarm), both for the insane MoE speedups and because you are, arguably, the godfather of the absolute SOTA quants in my eyes.
Here's the pull request from mainline for dots:
9ae4143bc6
Regardless of whether it’s on your roadmap or not, I just wanted to say thank you, ikawrakow, for all that you have done and continue to do. You are one of a kind.
🗣️ Discussion
👤 saood06 replied on 2025-06-20 at 03:21:14:

> The model seems interesting enough.

I agree. From a quick skim of the PR code, I don't see anything that would lead to a complicated port, so I could do it if no one else gets to it first. Especially given this part of that PR:

> The model architecture is a combination of Qwen and Deepseek parts, as seen here:
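
For context, here is a minimal sketch of what "Qwen parts + Deepseek parts" implies for the port, assuming a llama.cpp-style per-layer build loop: attention can reuse an existing Qwen-style path, while the feed-forward follows a Deepseek-style MoE schedule (a few leading dense layers, MoE everywhere after). All identifiers and hyperparameter values below are illustrative placeholders, not code from the PR:

```cpp
// Sketch only: per-layer dispatch for a Qwen/Deepseek hybrid architecture.
// HParams fields and values are placeholders, not the model's real config.
#include <cstdio>

struct HParams {
    int n_layer        = 62; // placeholder layer count
    int n_dense_layers = 1;  // leading layers with a plain dense FFN
};

enum class FfnKind { Dense, MoE };

// Deepseek-style schedule: the first n_dense_layers use a plain FFN,
// every layer after that uses the MoE block (shared + routed experts).
static FfnKind ffn_kind(const HParams & hp, int il) {
    return il < hp.n_dense_layers ? FfnKind::Dense : FfnKind::MoE;
}

int main() {
    HParams hp;
    for (int il = 0; il < hp.n_layer; ++il) {
        std::printf("layer %2d: attn = qwen-style, ffn = %s\n", il,
                    ffn_kind(hp, il) == FfnKind::Dense ? "dense" : "moe");
    }
    return 0;
}
```

Since both halves already exist in the codebase, a port along these lines is mostly wiring plus tensor-name mapping in the converter, which is consistent with the "nothing complicated" assessment above.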
👤 firecoperana replied on 2025-07-02 at 22:56:45:

@saood06 Are you working on it? If not, I can give it a try.

👤 saood06 replied on 2025-07-03 at 02:23:35:
#573 exists now. Testing is welcome.