100 Commits

Author SHA1 Message Date
Peilin Li
171578a7ec [refactor]: Change named 'KT-SFT' to 'kt-sft' (#1626)
* Change named 'KT-SFT' to 'kt-sft'

* [docs]: update kt-sft name

---------

Co-authored-by: ZiWei Yuan <yzwliam@126.com>
2025-11-17 11:48:42 +08:00
ZiWei Yuan
550e4986f5 [docs]: update README.md (#1616)
* [docs]: update README.md
2025-11-15 20:56:26 +08:00
ZiWei Yuan
7c2ad6dbca [docs]: update README.md (#1614)
* [docs]: update README.md
2025-11-15 18:34:27 +08:00
ErvinXie
5179f0d634 Add roadmap link to README (#1585) 2025-11-10 18:15:53 +08:00
Jiaqi Liao
07322ca2bd Refactor: restructure repository to focus on kt-kernel and KT-SFT modules (#1583)
* refactor repo

* fix README
2025-11-10 17:57:48 +08:00
ErvinXie
2cb1674020 Fix image reference in README.md (#1584)
Updated image reference in README for heterogeneous computing.
2025-11-10 17:53:41 +08:00
Jiaqi Liao
57d14d22bc Refactor: restructure repository to focus on kt-kernel and KT-SFT modulesq recon (#1581)
* refactor: move legacy code to archive/ directory

  - Moved ktransformers, csrc, third_party, merge_tensors to archive/
  - Moved build scripts and configurations to archive/
  - Kept kt-kernel, KT-SFT, doc, and README files in root
  - Preserved complete git history for all moved files

* refactor: restructure repository to focus on kt-kernel and KT-SFT modules

* fix README

* fix README

* fix README

* fix README

* docs: add performance benchmarks to kt-kernel section

Add comprehensive performance data for kt-kernel to match KT-SFT's presentation:
- AMX kernel optimization: 21.3 TFLOPS (3.9× faster than PyTorch)
- Prefill phase: up to 20× speedup vs baseline
- Decode phase: up to 4× speedup
- NUMA optimization: up to 63% throughput improvement
- Multi-GPU (8×L20): 227.85 tokens/s total throughput with DeepSeek-R1 FP8

Source: https://lmsys.org/blog/2025-10-22-KTransformers/

This provides users with concrete performance metrics for both core modules,
making it easier to understand the capabilities of each component.

* refactor: improve kt-kernel performance data with specific hardware and models

Replace generic performance descriptions with concrete benchmarks:
- Specify exact hardware: 8×L20 GPU + Xeon Gold 6454S, Single/Dual-socket Xeon + AMX
- Include specific models: DeepSeek-R1-0528 (FP8), DeepSeek-V3 (671B)
- Show detailed metrics: total throughput, output throughput, concurrency details
- Match KT-SFT presentation style for consistency

This provides users with actionable performance data they can use to evaluate
hardware requirements and expected performance for their use cases.

* fix README

* docs: clean up performance table and improve formatting

* add pic for README

* refactor: simplify .gitmodules and backup legacy submodules

- Remove 7 legacy submodules from root .gitmodules (archive/third_party/*)
- Keep only 2 active submodules for kt-kernel (llama.cpp, pybind11)
- Backup complete .gitmodules to archive/.gitmodules
- Add documentation in archive/README.md for researchers who need legacy submodules

This reduces initial clone size by ~500MB and avoids downloading unused dependencies.

* refactor: move doc/ back to root directory

Keep documentation in root for easier access and maintenance.

* refactor: consolidate all images to doc/assets/

- Move kt-kernel/assets/heterogeneous_computing.png to doc/assets/
- Remove KT-SFT/assets/ (images already in doc/assets/)
- Update KT-SFT/README.md image references to ../doc/assets/
- Eliminates ~7.9MB image duplication
- Centralizes all documentation assets in one location

* fix pic path for README
2025-11-10 17:42:26 +08:00
Atream
86229c852d Add update for Kimi-K2-Thinking support 2025-11-06 17:56:46 +08:00
ovowei
44e47ad75a update readme.md 2025-11-05 23:30:58 +08:00
ovowei
00f038e763 update readme.md 2025-11-05 23:29:59 +08:00
ovowei
1e17d75bfd fix 2025-10-30 10:47:05 +08:00
ovowei
ca21992e46 update readme.md. (Support Ascend NPU) 2025-10-27 20:53:06 +08:00
Atream
8ef6111ae0 Update README with Citation link 2025-10-10 19:12:31 +08:00
Atream
1e48eab7d5 Add citation section to README
Added citation section with reference to KTransformers paper.
2025-10-10 18:59:29 +08:00
Atream
e93abc93ec Add SGLang Integration to README.md 2025-10-10 18:50:05 +08:00
Jianwei Dong
d4b3fe2427 Merge branch 'main' into support-qwen3next 2025-09-12 21:59:32 +08:00
djw
a44b710649 support qwen3 next 2025-09-11 11:55:09 +00:00
Azure
24fe61bbc3 Update date for Kimi-K2-0905 support 2025-09-05 17:47:17 +08:00
Azure-Tang
b6d36bffbb update kimi-k2-0905 2025-09-05 03:52:43 +00:00
qiyuxinlin
1334ddc833 update readme 2025-07-25 17:02:36 +00:00
Atream
cf79c93fae Update README.md 2025-07-11 09:35:12 +08:00
Atream
18690d819f Update README.md 2025-07-11 09:34:07 +08:00
ErvinXie
aadf31b35d Update README.md 2025-06-30 17:55:49 +08:00
ErvinXie
a9a72e52c3 Update README.md 2025-06-30 14:56:46 +08:00
liam Yuan
22d0d9ccb2 update vendor ZTE name 2025-06-23 21:07:17 +08:00
liam Yuan
cb77b52c63 update vendor support list 2025-06-23 21:00:01 +08:00
Atream
d051a14941 Update README.md 2025-05-15 10:29:43 +08:00
rnwang04
142fb7ce6c Enable support for Intel XPU devices, add support for DeepSeek V2/V3 first 2025-05-14 19:37:27 +00:00
Atream
7ebf82a492 Update Qwen3 date 2025-04-29 09:43:13 +08:00
qiyuxinlin
a3ba63665a update readme 2025-04-28 22:38:41 +00:00
qiyuxinlin
89823ccb1f update readme 2025-04-28 22:34:47 +00:00
qiyuxinlin
e7763a4b59 update readme 2025-04-28 22:32:35 +00:00
qiyuxinlin
d3ebdafd4b update readme 2025-04-28 22:31:09 +00:00
qiyuxinlin
59b0631e33 update readme 2025-04-28 22:26:38 +00:00
qiyuxinlin
8f76c37d86 fix readme 2025-04-28 22:17:22 +00:00
qiyuxinlin
cb5617b479 update readme 2025-04-28 22:14:23 +00:00
djw
26798500bd update llama4 tutorial 2025-04-09 09:40:08 +00:00
djw
f73b4ca706 update llama4 tutorial 2025-04-09 09:36:30 +00:00
dongjw
4ed9744ebb update readme 2025-04-02 14:02:57 +08:00
dongjw
b62cefaec9 update readme 2025-04-02 13:11:01 +08:00
Azure-Tang
3a5330b215 Merge branch 'main' into work-concurrent 2025-04-01 06:48:19 +00:00
Atream
25cee5810e add balance-serve, support concurrence 2025-03-31 22:55:32 +08:00
Atream
d4c6c2bb02 Update README.md 2025-03-22 12:14:36 +08:00
liam
4748a912e2 📝 fix typo ktransformer->ktransformers 2025-03-17 17:54:00 +08:00
Azure-Tang
e5b001d76f Update readme; Format code; Add example yaml. 2025-03-14 14:25:52 -04:00
Azure
034a116365 update readme 2025-03-05 10:04:43 +00:00
Azure
91c1619296 Merge branch 'develop-0.2.2' into support-fp8
Update README.md
2025-02-25 13:43:26 +00:00
Azure
36fbeee341 Update doc 2025-02-25 08:21:18 +00:00
_
5ed441a0f5 Update README.md 2025-02-21 14:15:50 +00:00
liam
13382f88ab 📝 update V0.2.1 Doc 2025-02-15 16:17:05 +08:00