* [feat]: redesign kt run interactive configuration with i18n support
- Redesign kt run with 8-step interactive flow (model selection, inference method, NUMA/CPU, GPU experts, KV cache, GPU/TP selection, parsers, host/port)
- Add configuration save/load system (~/.ktransformers/run_configs.yaml)
- Add i18n support for kt chat (en/zh translations)
- Add universal input validators with auto-retry and Chinese comma support
- Add port availability checker with auto-suggestion
- Add parser configuration (--tool-call-parser, --reasoning-parser)
- Remove tuna command and clean up redundant files
- Fix a variable reference bug in run.py; filter the model list to show only MoE models
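The port availability checker with auto-suggestion could be sketched roughly as below; the function name, search range, and bind-based probe are illustrative assumptions, not the actual KTransformers implementation:

```python
import socket

def find_available_port(preferred: int, max_tries: int = 20) -> int:
    """Return `preferred` if it is free, else suggest the next free port.

    Hypothetical helper sketching the auto-suggestion behavior: a port is
    considered free if we can bind a TCP socket to it on localhost.
    """
    for port in range(preferred, preferred + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind(("127.0.0.1", port))
                return port  # bindable, so currently free
            except OSError:
                continue  # already in use; try the next port
    raise RuntimeError(f"no free port in [{preferred}, {preferred + max_tries})")
```

Binding (rather than connecting) avoids false positives from ports that are filtered but unused; `SO_REUSEADDR` keeps ports in TIME_WAIT from being reported as busy, which is acceptable for a suggestion heuristic.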
* [feat]: unify model selection UI and disable shared experts fusion by default
- Unify kt run model selection table with kt model list display
* Add Total size, MoE Size, Repo, and SHA256 status columns
* Use consistent formatting and styling
* Give users more information when choosing a model
- Enable --disable-shared-experts-fusion by default
* Change default value from False to True
* Users can still override with --enable-shared-experts-fusion
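The flag pair described above can be modeled with a standard argparse pattern; a sketch under the assumption that both options write to one shared destination (the actual CLI wiring in KTransformers may differ):

```python
import argparse

parser = argparse.ArgumentParser()
# Default changed from False to True: fusion is now disabled unless
# the user explicitly opts back in.
parser.add_argument(
    "--disable-shared-experts-fusion",
    dest="disable_shared_experts_fusion",
    action="store_true",
    default=True,
)
# Explicit override flag flips the same destination back to False.
parser.add_argument(
    "--enable-shared-experts-fusion",
    dest="disable_shared_experts_fusion",
    action="store_false",
)
```

Sharing one `dest` keeps the two flags mutually consistent: whichever appears last on the command line wins.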
* [feat]: improve kt chat with performance metrics and better CJK support
- Add performance metrics display after each response
* Total time, TTFT (Time To First Token), TPOT (Time Per Output Token)
* Accurate input/output token counts using model tokenizer
* Fallback to estimation if tokenizer unavailable
* Metrics shown in dim style (not prominent)
- Fix Chinese character input issues
* Replace Prompt.ask() with console.input() for better CJK support
* Fixes backspace deletion showing half-characters
- Suppress NumPy subnormal warnings
* Filter "The value of the smallest subnormal" warnings
* Cleaner CLI output on certain hardware environments
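The metrics above can be derived from three timestamps plus the output token count; a hedged sketch with illustrative names, not the actual kt chat internals:

```python
def stream_metrics(start_time: float, first_token_time: float,
                   end_time: float, output_tokens: int) -> dict:
    """Compute Total time, TTFT, and TPOT for one streamed response.

    TTFT is request start to first token; TPOT averages the remaining
    decode time over the tokens after the first one.
    """
    ttft = first_token_time - start_time
    decode_time = end_time - first_token_time
    tpot = decode_time / max(output_tokens - 1, 1)  # guard against divide-by-zero
    return {
        "total_s": end_time - start_time,
        "ttft_ms": ttft * 1000,
        "tpot_ms": tpot * 1000,
    }
```

Dividing by `output_tokens - 1` (not `output_tokens`) is the conventional TPOT definition, since the first token's latency is already accounted for by TTFT.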
* [fix]: correct TTFT measurement in kt chat
- Move start_time initialization before API call
- Previously start_time was set when receiving first chunk, causing TTFT ≈ 0ms
- Now correctly measures time from request sent to first token received
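A minimal sketch of the corrected ordering, where `send_request` and `iter_chunks` are hypothetical stand-ins for the real streaming API calls:

```python
import time

def chat_once(send_request, iter_chunks):
    """Measure TTFT with the clock started *before* the request is sent."""
    start_time = time.perf_counter()  # the fix: set before the API call
    response = send_request()
    first_token_time = None
    for chunk in iter_chunks(response):
        if first_token_time is None:
            first_token_time = time.perf_counter()  # stamp the first chunk only
        # ... handle chunk ...
    ttft = (first_token_time - start_time) if first_token_time else None
    return ttft
```

With `start_time` set inside the first-chunk branch (the old bug), both timestamps were taken at nearly the same instant, which is why TTFT read as roughly 0 ms.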
* [docs]: add Clawdbot integration guide: deploying KTransformers as an enterprise-grade AI assistant
* [docs]: recommend Kimi K2.5 as the core model, highlighting its enterprise-grade inference capability
* [docs]: add link to the Clawdbot Feishu (Lark) integration tutorial
* [feat]: improve CLI table display, model verification, and chat experience
- Add sequence number (#) column to all model tables by default
- Filter kt edit to show only MoE GPU models (exclude AMX)
- Extend kt model verify to check *.json and *.py files in addition to weights
- Fix re-verification bug where repaired files caused false failures
- Suppress tokenizer debug output in kt chat token counting
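The extended verification could look roughly like this hedged sketch; the helper name and the `expected` digest map are assumptions for illustration, not the actual kt model verify code:

```python
import hashlib
from pathlib import Path

def verify_aux_files(model_dir: str, expected: dict) -> list:
    """Check *.json and *.py files against expected SHA256 hex digests.

    `expected` maps a relative filename to its digest; any missing or
    mismatched entry is reported as failed.
    """
    failed = []
    root = Path(model_dir)
    for pattern in ("*.json", "*.py"):
        for f in sorted(root.glob(pattern)):
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            if expected.get(f.name) != digest:
                failed.append(f.name)
    return failed
```

Recomputing digests from the files on disk (rather than caching results) is what keeps re-verification honest after a repair, which is the class of bug the fix above addresses.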
* [fix]: fix CPU core handling.
---------
Co-authored-by: skqliao <skqliao@gmail.com>