mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-19 20:54:36 +00:00
📝 #452 - Falcon H1 Support
| Author | Downtown-Case |
|---|---|
| State | ❌ Closed |
| Created | 2025-05-23 |
| Updated | 2025-06-27 |
Description
A hybrid transformers/mamba2 series with good performance: https://huggingface.co/collections/tiiuae/falcon-h1-6819f2795bc406da60fab8df
Officially supported via their fork of llama.cpp here: https://github.com/tiiuae/llama.cpp-Falcon-H1
Support for ik_llama.cpp's tighter quantization schemes would be nice :). Maybe something in this fork can shrink the Mamba2 context cache as well?
💬 Conversation
👤 ikawrakow commented the 2025-05-24 at 07:04:24:
Have you thought about adding a feature request to the llama.cpp-Falcon-H1 authors?
👤 Downtown-Case commented the 2025-06-02 at 18:19:21:
Seems their implementation needs more time in the oven anyway.
👤 Downtown-Case commented the 2025-06-27 at 14:31:42:
Closing this