mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-03 18:40:14 +00:00
### 📝 [#452](https://github.com/ikawrakow/ik_llama.cpp/issues/452) - Falcon H1 Support

| **Author** | `Downtown-Case` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2025-05-23 |
| **Updated** | 2025-06-27 |

---

#### Description

A hybrid transformers/mamba2 series with good performance: https://huggingface.co/collections/tiiuae/falcon-h1-6819f2795bc406da60fab8df

Officially supported via their fork of llama.cpp here: https://github.com/tiiuae/llama.cpp-Falcon-H1

Support for ik_llama.cpp's tighter quantization schemes would be nice :). Maybe something in this fork can shrink the Mamba2 context cache as well?

---

#### 💬 Conversation

👤 **ikawrakow** commented the **2025-05-24** at **07:04:24**:<br>

Have you thought about adding a feature request to the llama.cpp-Falcon-H1 authors?

---

👤 **Downtown-Case** commented the **2025-06-02** at **18:19:21**:<br>

Seems their implementation needs more time in the oven anyway.

---

👤 **Downtown-Case** commented the **2025-06-27** at **14:31:42**:<br>

Closing this