### 📝 [#452](https://github.com/ikawrakow/ik_llama.cpp/issues/452) - Falcon H1 Support

| **Author** | `Downtown-Case` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2025-05-23 |
| **Updated** | 2025-06-27 |

---

#### Description

Falcon H1 is a hybrid Transformer/Mamba2 model series with good performance: https://huggingface.co/collections/tiiuae/falcon-h1-6819f2795bc406da60fab8df

Officially supported via their fork of llama.cpp here: https://github.com/tiiuae/llama.cpp-Falcon-H1

Support for ik_llama.cpp's tighter quantization schemes would be nice :). Maybe something in this fork can shrink the Mamba2 context cache as well?

---

#### 💬 Conversation

👤 **ikawrakow** commented the **2025-05-24** at **07:04:24**:<br>

Have you thought about adding a feature request to the llama.cpp-Falcon-H1 authors?

---

👤 **Downtown-Case** commented the **2025-06-02** at **18:19:21**:<br>

Seems their implementation needs more time in the oven anyway.

---

👤 **Downtown-Case** commented the **2025-06-27** at **14:31:42**:<br>

Closing this.