[docs]: add kt-cli doc and update corresponding website (#1768)

ZiWei Yuan
2025-12-29 23:06:22 +08:00
committed by GitHub
parent 9539ab91eb
commit b096b01fbc
5 changed files with 50 additions and 3 deletions

2
.gitignore vendored

@@ -29,4 +29,4 @@ csrc/demo
build*
CMakeFiles/
kvc2/
sched/
sched/


@@ -18,6 +18,7 @@
<!-- # For Developer
- [Makefile Usage](en/makefile_usage.md) -->
- [kt-kernel part](en/kt-kernel/README.md)
- [kt-cli](en/kt-kernel/kt-cli.md)
# FAQ
- [FAQ](en/FAQ.md)
<!-- # V3 Reproduction


@@ -1,2 +1 @@
# kt-kernel Docs
To be written...
# kt-kernel Docs


@@ -0,0 +1,43 @@
# KT-CLI
> ⚠️ **Note:** KT-CLI is under active development. Many features are not yet complete and are still being improved. Please stay tuned for updates.
## Design Philosophy
KT-CLI is designed to **minimize the burden of reading documentation**. Instead of requiring users to read lengthy docs, the CLI provides:
- **Interactive Mode**: Run commands without arguments to get step-by-step guided prompts
- **Direct Mode**: Pass arguments directly for automation and scripting
> 💡 **Tip:** The arguments are fully compatible with the previous SGLang + KTransformers approach, so you can migrate seamlessly.
Simply run a command, and the CLI will interactively guide you through the process!
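The two modes above can be illustrated with a minimal Python sketch. This is not KT-CLI's actual implementation: the function name `resolve_model_path` and the `--model-path` flag are hypothetical and exist only to show the pattern of using an argument when one is passed and prompting when it is missing.

```python
import argparse


def resolve_model_path(argv, ask=input):
    """Return a model path from argv (direct mode) or from a prompt (interactive mode).

    Hypothetical helper for illustration only; `--model-path` is NOT a
    documented KT-CLI flag.
    """
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--model-path", default=None)  # hypothetical flag
    args = parser.parse_args(argv)

    if args.model_path is not None:
        # Direct mode: the value was passed on the command line.
        return args.model_path

    # Interactive mode: no argument given, so guide the user with a prompt.
    return ask("Model path: ").strip()
```

Passing `ask` as a parameter keeps the prompt replaceable, so the same function can be driven by a script or a test without a terminal.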
## Usage
Run `kt --help` to see the usage:
```
kt [OPTIONS] COMMAND [ARGS]...
```
KTransformers CLI - A unified command-line interface for KTransformers.
## Options
| Option | Description |
|--------|-------------|
| `--help` | Show this message and exit. |
## Commands
| Command | Description |
|---------|-------------|
| `version` | Show version information |
| `chat` | Interactive chat with running model |
| `quant` | Quantize model weights |
| `bench` | Run full benchmark |
| `microbench` | Run micro-benchmark |
| `doctor` | Diagnose environment issues |
| `model` | Manage models and storage paths |
| `config` | Manage configuration |
| `sft` | Fine-tuning with LlamaFactory |
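For scripting, the commands in the table above can be invoked from Python. The sketch below is an assumption-laden wrapper, not part of KT-CLI: it assumes the executable is named `kt` and is on `PATH`, the helper names `build_kt_argv` and `run_kt` are hypothetical, and it passes through only the subcommand names listed in the table without assuming any of their arguments.

```python
import shutil
import subprocess

# Subcommand names copied from the Commands table in this document.
KT_COMMANDS = {
    "version", "chat", "quant", "bench", "microbench",
    "doctor", "model", "config", "sft",
}


def build_kt_argv(command, *args):
    """Build the argument vector for a `kt` invocation (hypothetical helper)."""
    if command not in KT_COMMANDS:
        raise ValueError(f"unknown kt command: {command!r}")
    return ["kt", command, *args]


def run_kt(command, *args):
    """Run `kt <command> [args...]`; assumes `kt` is installed and on PATH."""
    if shutil.which("kt") is None:
        raise FileNotFoundError("kt executable not found on PATH")
    # check=False leaves exit-code handling to the caller.
    return subprocess.run(build_kt_argv(command, *args), check=False)
```

For example, `run_kt("version")` would run `kt version`, and `run_kt("doctor")` would run the environment diagnosis.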


@@ -32,6 +32,10 @@ High-performance kernel operations for KTransformers, featuring CPU-optimized Mo
- **Kimi-K2 Native INT4 (RAWINT4)**: Supported on AVX512 CPUs (CPU-GPU shared INT4 weights) - [Guide](https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/Kimi-K2-Thinking-Native.md)
- **FP8 weights (e.g., MiniMax-M2.1)**: Supported on AVX512 CPUs (CPU-GPU shared FP8 weights) - [Guide](https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/MiniMax-M2.1-Tutorial.md)
**KT-CLI**
We are developing a simpler way to use KTransformers. Check out the [KT-CLI Guide](https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/kt-cli.md) for more details.
## Features
- **CPU-Optimized MoE Kernels**: High-throughput MoE expert kernels optimized for instruction sets.