Mirror of https://github.com/kvcache-ai/ktransformers.git, synced 2026-04-20 14:29:22 +00:00
update readme
11
README.md
@@ -23,19 +23,14 @@ Our vision for KTransformers is to serve as a flexible platform for experimentin

<h2 id="Updates">🔥 Updates</h2>

* **July 26, 2025**: Support SmallThinker and GLM4-MoE. ([Tutorial](./doc/en/SmallThinker_and_Glm4moe.md))
* **July 11, 2025**: Support Kimi-K2. ([Tutorial](./doc/en/Kimi-K2.md))
* **June 30, 2025**: Support 3-layer (GPU-CPU-Disk) [prefix cache](./doc/en/prefix_cache.md) reuse.
* **May 14, 2025**: Support Intel Arc GPU ([Tutorial](./doc/en/xpu.md)).
* **Apr 29, 2025**: Support AMX-Int8, AMX-BF16 and Qwen3MoE ([Tutorial](./doc/en/AMX.md))

https://github.com/user-attachments/assets/fafe8aec-4e22-49a8-8553-59fb5c6b00a2

* **Apr 9, 2025**: Experimental support for LLaMA 4 models ([Tutorial](./doc/en/llama4.md)).
* **Apr 2, 2025**: Support multi-concurrency ([Tutorial](./doc/en/balance-serve.md)).
@@ -65,7 +60,7 @@ https://github.com/user-attachments/assets/ebd70bfa-b2c1-4abb-ae3b-296ed38aa285
</p>

- **[NEW!!!] Local 671B DeepSeek-Coder-V3/R1:** Running its Q4_K_M version using only 14GB VRAM and 382GB DRAM ([Tutorial](./doc/en/DeepseekR1_V3_tutorial.md)).
- Prefill Speed (tokens/s):
- KTransformers: 54.21 (32 cores) → 74.362 (dual-socket, 2×32 cores) → 255.26 (optimized AMX-based MoE kernel, V0.3 only) → 286.55 (selectively using 6 experts, V0.3 only)
- Compared to 10.31 tokens/s in llama.cpp with 2×32 cores, achieving up to **27.79× speedup**.
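
The quoted speedup can be checked by dividing each KTransformers prefill figure by the llama.cpp baseline. The sketch below uses only the numbers listed above; the dictionary keys and the `speedup` helper are illustrative names, not part of KTransformers.

```python
# Prefill throughput in tokens/s, copied from the figures quoted above.
KTRANSFORMERS_PREFILL = {
    "32 cores": 54.21,
    "dual-socket, 2x32 cores": 74.362,
    "AMX-based MoE kernel (V0.3 only)": 255.26,
    "6 selected experts (V0.3 only)": 286.55,
}
LLAMA_CPP_PREFILL = 10.31  # llama.cpp baseline with 2x32 cores


def speedup(tokens_per_s: float, baseline: float = LLAMA_CPP_PREFILL) -> float:
    """Return how many times faster a configuration is than the baseline."""
    return tokens_per_s / baseline


# Hypothetical helper table: speedup per configuration, rounded to 2 decimals.
speedups = {
    name: round(speedup(rate), 2) for name, rate in KTRANSFORMERS_PREFILL.items()
}
best_config = max(speedups, key=speedups.get)

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
print(f"best: {best_config}")
```

The largest ratio, 286.55 / 10.31, rounds to 27.79, which matches the "up to 27.79×" figure; it applies only to the V0.3 configuration that selects 6 experts.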
@@ -131,7 +126,6 @@ we have already supported vendors:
- Kunpeng
- AMD

### 📥 Installation

To install KTransformers, follow the official [Installation Guide](https://kvcache-ai.github.io/ktransformers/en/install.html).
@@ -201,3 +195,4 @@ If you have any questions, feel free to open an issue. Alternatively, you can jo
<h2 id="FAQ">🙋 FAQ</h2>

Some common questions are answered in the [FAQ](doc/en/FAQ.md).