ktransformers/doc/en at 71f683acecd1e034cd2f942ed478fcb74dd92e93 - ktransformers - Public git mirror

kvcache-ai/ktransformers

mirror of https://github.com/kvcache-ai/ktransformers.git synced 2026-04-20 06:18:59 +00:00

ErvinXie 71f683acec Support Native Kimi K2 Thinking (#1663)

* [feat]: fix k2 prefill

* Update Kimi-K2-Thinking.md

* Create Kimi-K2-Thinking-Native.md

* Update Kimi-K2-Thinking.md

* Update Kimi-K2-Thinking.md

* Update Kimi-K2-Thinking-Native.md

* [perf] optimize K2 MoE weight loading with per-expert pointers

- Avoid expensive torch.stack().contiguous() in Python (was ~6.6s)
- Use per-expert pointer arrays (gate_projs) instead of contiguous memory
- C++ worker pool performs parallel memcpy for TP slicing
- Add LOAD_TIME_PROFILE for load_weights timing analysis
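
The idea in the bullets above can be sketched roughly as follows. This is not the code from the commit: it is a minimal stdlib-only illustration in which `ctypes` buffers stand in for per-expert tensors, a Python thread pool stands in for the C++ worker pool, and the helper names (`make_expert`, `build_pointer_table`, `load_tp_slices`) and sizes are hypothetical. It assumes each TP rank takes one contiguous slice of every expert.

```python
import ctypes
from concurrent.futures import ThreadPoolExecutor

def make_expert(expert_id, nbytes):
    """Allocate one expert's weight in its own buffer (hypothetical test data)."""
    data = bytes((expert_id * nbytes + i) % 251 for i in range(nbytes))
    return (ctypes.c_ubyte * nbytes).from_buffer_copy(data)

def build_pointer_table(experts):
    """Collect one base address per expert instead of stacking the buffers."""
    addrs = [ctypes.addressof(buf) for buf in experts]
    return (ctypes.c_void_p * len(addrs))(*addrs)

def load_tp_slices(ptrs, num_experts, expert_bytes, tp):
    """Copy each rank's contiguous slice of every expert into a per-rank buffer."""
    slice_bytes = expert_bytes // tp
    dsts = [(ctypes.c_ubyte * (num_experts * slice_bytes))() for _ in range(tp)]

    def copy_one(task):
        rank, e = task
        src = ptrs[e] + rank * slice_bytes                    # offset into expert e
        dst = ctypes.addressof(dsts[rank]) + e * slice_bytes  # slot for expert e
        ctypes.memmove(dst, src, slice_bytes)

    # One small copy per (rank, expert) pair, handed to a worker pool.
    tasks = [(r, e) for r in range(tp) for e in range(num_experts)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(copy_one, tasks))
    return [bytes(d) for d in dsts]

# The expert buffers must stay alive while their addresses are in use.
experts = [make_expert(e, 128) for e in range(4)]
ptrs = build_pointer_table(experts)
rank_slices = load_tp_slices(ptrs, num_experts=4, expert_bytes=128, tp=2)
```

The point of the pointer table is that no single large contiguous copy of all experts is ever built; only the slices each rank needs are copied.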

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: ouqingliang <1692110604@qq.com>
Co-authored-by: Claude <noreply@anthropic.com>

2025-12-05 21:53:05 +08:00

Necessary tips for Node.js related issues

2025-02-19 16:37:18 +08:00

[docs]: Add deepseek-v3.2 run tutorial (#1659)

2025-12-02 20:04:10 +08:00

Initial commit

2024-07-27 16:06:58 +08:00

[refactor]: Change named 'KT-SFT' to 'kt-sft' (#1626)

2025-11-17 11:48:42 +08:00

AMX.md

Update AMX.md

2025-04-29 11:12:51 +08:00

balance-serve.md

add flashinfer to cuda device

2025-05-15 07:03:45 +00:00

benchmark.md

⚡ release v0.2.3

2025-03-05 20:21:04 +08:00

deepseek-v2-injection.md

* Reorganize documentation/README

2025-02-14 19:58:26 +00:00

DeepseekR1_V3_tutorial.md

fix typo (#1452)

2025-11-10 16:08:04 +08:00

Docker_xpu.md

docs: add Dockerfile.xpu and GPU driver setup instructions

2025-05-28 13:55:35 +08:00

Docker.md

📝 fix typo ktransformer->ktransformers

2025-03-17 17:54:00 +08:00

FAQ.md

[doc]: update web doc and kt-kernel doc (#1609)

2025-11-13 20:44:13 +08:00

fp8_kernel.md

Update fp8 doc; Update install.md broken link

2025-02-26 15:43:08 +00:00

install.md

Merge pull request #1307 from kvcache-ai/hyc

2025-05-17 15:25:33 +08:00

Kimi-K2-Thinking-Native.md

Support Native Kimi K2 Thinking (#1663)

2025-12-05 21:53:05 +08:00

Kimi-K2-Thinking.md

Support Native Kimi K2 Thinking (#1663)

2025-12-05 21:53:05 +08:00

Kimi-K2.md

Update GGUF format link in Kimi-K2 documentation

2025-09-05 20:19:37 +08:00

Kllama_tutorial_DeepSeekV2Lite.ipynb

upload hands-on tutorial with KTransformers-FT, especially in customize your KT-FT+LLaMA-Factory (#1597)

2025-11-11 20:54:41 +08:00

KTransformers Full Introduction for Motivation and Practice.pdf

[docs]: Add Full introduction of KT (#1636)

2025-11-29 15:46:55 +08:00

KTransformers-FT_PPT_share.pdf

upload hands-on tutorial with KTransformers-FT, especially in customize your KT-FT+LLaMA-Factory (#1597)

2025-11-11 20:54:41 +08:00

llama4.md

add flashinfer to cuda device

2025-05-15 07:03:45 +00:00

long_context_introduction.md

docs: update long_context_introduction.md

2024-08-30 03:34:39 +09:00

long_context_tutorial.md

update readme

2024-08-29 12:04:56 +08:00

makefile_usage.md

✨: rm sensitive info in config.yaml, add readme of makefile. support old model_path config

2024-11-04 14:02:19 +08:00

multi-gpu-tutorial.md

* Reorganize documentation/README

2025-02-14 19:58:26 +00:00

prefix_cache.md

update kvc disk path config.

2025-06-30 15:09:35 +00:00

Qwen3-Next.md

fix bug

2025-09-16 13:21:58 +00:00

ROCm.md

Update readme; Format code; Add example yaml.

2025-03-14 14:25:52 -04:00

SFT_Installation_Guide_KimiK2.md

Update SFT Installation Guide for KimiK2

2025-11-06 17:34:21 +08:00

SmallThinker_and_Glm4moe.md

update smallthinker and glm4 readme

2025-07-31 03:14:49 +00:00

V3-success.md

📝 ⚡ fix some debug output and update doc

2025-02-13 17:25:12 +08:00

xpu.md

docs: add Dockerfile.xpu and GPU driver setup instructions

2025-05-28 13:55:35 +08:00