mirror of
https://github.com/kvcache-ai/ktransformers.git
synced 2026-04-29 18:51:15 +00:00
@@ -23,6 +23,7 @@ Our vision for KTransformers is to serve as a flexible platform for experimentin
 
 <h2 id="Updates">🔥 Updates</h2>
 
+* **July 11, 2025**: Support Kimi-K2-0905. ([Tutorial](./doc/en/Kimi-K2.md))
 * **July 26, 2025**: Support SmallThinker and GLM4-MoE. ([Tutorial](./doc/en/SmallThinker_and_Glm4moe.md))
 * **July 11, 2025**: Support Kimi-K2. ([Tutorial](./doc/en/Kimi-K2.md))
 * **June 30, 2025**: Support 3-layer (GPU-CPU-Disk) [prefix cache](./doc/en/prefix_cache.md) reuse.
@@ -3,7 +3,7 @@
 ## Introduction
 
 ### Overview
-We are very pleased to announce that Ktransformers now supports Kimi-K2.
+We are very pleased to announce that Ktransformers now supports Kimi-K2 and Kimi-K2-0905.
 
 On a single-socket CPU with one consumer-grade GPU, running the Q4_K_M model yields roughly 10 TPS and requires about 600 GB of DRAM.
 With a dual-socket CPU and sufficient system memory, enabling NUMA optimizations increases performance to about 14 TPS.
@@ -14,6 +14,10 @@ With a dual-socket CPU and sufficient system memory, enabling NUMA optimizations
 - https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d
 - GGUF Format(quantized models):
 - https://huggingface.co/KVCache-ai/Kimi-K2-Instruct-GGUF
+- Official Kimi-K2-0905 Release:
+- https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905
+- GGUF Format(quantized models):
+- Uploading...
 
 ## Installation Guide
 