Commit Graph

  • c1cb22311b [build]: sync sglang submodule to 51032b71279d9038058563f8d2e758d99b278ef4 (#2032) main github-actions[bot] 2026-06-05 19:24:22 +08:00
  • 60eb08f8d8 [build]: sync sglang submodule to 51032b71279d9038058563f8d2e758d99b278ef4 auto/sync-sglang ovowei 2026-06-05 09:13:18 +00:00
  • a14b6dd5ec deploy: c9a915e6ac gh-pages jdai0 2026-06-05 08:57:40 +00:00
  • c9a915e6ac [feat](kt-lora): add end-to-end Qwen3.5 MoE KT LoRA serving workflow (#2031) Jiaheng Dai 2026-06-05 16:57:14 +08:00
  • d41f569e84 [fix](cli): detect SGLANG_DSV4_2604_SUBMODE conflict before launch (#2025) devangpratap 2026-05-30 07:20:47 -04:00
  • ef6c47f9d2 [feat](kt-kernel): AVX2 MXFP4 MoE MXFP4 dispatch (#2015) Benjamin 2026-05-30 19:20:16 +08:00
  • f1e2b82c74 [fix] Add runtime AMX BF16 check to prevent SIGILL on pre-Sapphire Rapids CPUs (#2018) Li Tingfang 2026-05-21 17:36:12 +08:00
  • b77e0975da deploy: eeeeae5e91 yyj6666667 2026-05-20 07:05:04 +00:00
  • eeeeae5e91 Fix duplicate BF16 loader definition (#1984) login256 2026-05-20 15:04:47 +08:00
  • f0772445a1 [perf]: native path for MXFP4 MoE on AVX512F (#2006) Jim James 2026-05-18 02:44:33 -05:00
  • 95e20f9c55 [build]: sync sglang submodule to ebaff7729b9e41c29d94f8d19a53473d321dc566 (#2005) github-actions[bot] 2026-05-14 22:25:31 +08:00
  • f05b4009f3 [fix](kt-kernel): fix double mem used by safetensor loader (#1997) Benjamin F 2026-05-11 12:00:30 +08:00
  • bb15fdf47e Release/0.6.2.post3: carry kt-kernel SwiGLU clamp companion missing from post2 v0.6.2.post3 Benjamin F 2026-05-10 03:55:02 +08:00
  • a0f9b299bc docs: add entrypoints and support matrix doc-reorg-entrypoints-support-matrix JimmyPeilinLi 2026-05-10 01:22:13 +08:00
  • 37db9a3b83 0.6.2.post2: submodule refactor and update tutorial (#1993) v0.6.2.post2 Benjamin F 2026-05-09 18:53:59 +08:00
  • f7c4fa68c5 [fix]: add guard for SFT MoE and remove guard for AMX FP4 MoE on AVX512F+BW (#1980) Jim James 2026-05-08 03:05:22 -05:00
  • c465557c23 docs(v4-flash): add optional AMXINT4 CPU-weight conversion path (#1986) Benjamin F 2026-05-08 15:35:05 +08:00
  • 8b9d233d42 docs(v4-flash): tilelang install, MTP flags, Ampere unsupported (#1979) Benjamin F 2026-05-06 17:29:38 +08:00
  • d7b5b49a3e [release]: 0.6.2.post1 v0.6.2.post1 Benjamin F 2026-05-03 21:07:23 +08:00
  • 96189972d8 build: bump sglang submodule to c9edb75e0 (V4-Flash GPU prefill fallback fix + perf) (#1975) Benjamin F 2026-05-03 19:42:19 +08:00
  • 088ed979d5 docs(v4-flash): pin transformers==4.57.1 in tutorial prerequisites (#1974) Benjamin F 2026-05-03 16:07:31 +08:00
  • 4b4312c0a2 release: bump version to 0.6.2 (#1973) v0.6.2 Benjamin F 2026-05-03 14:28:09 +08:00
  • bb3b6e8413 build: bump sglang submodule to 40d3a82 (V4-Flash flashinfer guard) (#1972) Benjamin F 2026-05-03 14:06:33 +08:00
  • 53f356c328 deploy: 041bdfc636 yyj6666667 2026-05-03 02:48:51 +00:00
  • 041bdfc636 [New Model] DeepSeek-V4-Flash: kt-kernel MXFP4 MoE + sglang hybrid inference (#1970) Benjamin F 2026-05-03 10:48:31 +08:00
  • fe06c4d355 [build]: sync sglang submodule to 537eb762b0881071a0e098bd78666fe052b83deb (#1967) github-actions[bot] 2026-05-02 12:42:04 +08:00
  • fb4e11db95 deploy: 02be2bf53f jdai0 2026-04-30 09:17:11 +00:00
  • 02be2bf53f [feat](kt-kernel): add AVX2/AVX-VNNI RAWINT4 MoE backend (#1942) Aliez Ren 2026-04-30 18:16:49 +09:00
  • 07f39626ae deploy: 8c634d5dca JimmyPeilinLi 2026-04-30 08:25:52 +00:00
  • 8c634d5dca [docs]: refresh kt inference and sft entry points Peilin Li 2026-04-30 16:25:34 +08:00
  • 24b1941b85 [fix]: point sglang extra to post2 (#1964) v0.6.1 Peilin Li 2026-04-30 11:57:02 +08:00
  • 72044ad65f [build]: bump v0.6.1 post1 package metadata v0.6.1.post1 Peilin Li 2026-04-30 01:02:44 +08:00
  • ef5822639f [fix](kt-kernel): pin torch 2.9.1 wheel baseline Peilin Li 2026-04-30 00:57:24 +08:00
  • 9f34ef46e6 [fix](Qwen3 series): fix gibberish output by correcting RoPE write-back (#31) (#1959) Benjamin F 2026-04-27 22:04:29 +08:00
  • 1cb73c7100 [fix](Qwen3 series): fix gibberish output by correcting RoPE write-back (#31) bump-sglang-pr31 yyj 2026-04-27 21:59:38 +08:00
  • 6fea6b7d99 deploy: 0656e01ac1 JimmyPeilinLi 2026-04-26 16:46:01 +00:00
  • 0656e01ac1 [docs]: refresh KT install commands (#1958) Peilin Li 2026-04-27 00:45:43 +08:00
  • d93ea7e21e [docs]: refresh KT install commands docs-v061-refresh JimmyPeilinLi 2026-04-26 16:29:20 +00:00
  • a7a575d41e [perf](kt-kernel): MXFP4 MoE add mat-mat 4×4 tile, refine mat-vec reduce (#1957) fp4-moe-amx Benjamin F 2026-04-26 17:34:08 +08:00
  • 07e274467a [build]: flatten ktransformers package shim (#1955) Peilin Li 2026-04-25 22:08:52 +08:00
  • d143cf3209 [build]: flatten ktransformers package shim flatten-ktransformers-shim JimmyPeilinLi 2026-04-25 14:01:20 +00:00
  • bfbd0e9352 [chore]: archive kt-sft package (#1954) Peilin Li 2026-04-25 21:49:21 +08:00
  • bdf01a24b2 [chore]: archive kt-sft package archive-kt-sft JimmyPeilinLi 2026-04-25 13:42:50 +00:00
  • 85f1ab530b [ci]: use hosted runner for sglang-kt release Peilin Li 2026-04-25 21:05:18 +08:00
  • bc7afff13b [chore]: sync sglang-kt packaging fix Peilin Li 2026-04-25 21:02:25 +08:00
  • 8484ef8b16 [feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash (#1950) Benjamin F 2026-04-25 18:11:53 +08:00
  • eeaeb7bfd7 [build]: align kt-kernel torch support with v0.6.1 release (#1948) Peilin Li 2026-04-24 23:45:15 +08:00
  • 0e60e94b10 [build]: align kt-kernel torch support with v0.6.1 release kt-kernel-torch-range-v061 JimmyPeilinLi 2026-04-24 15:42:52 +00:00
  • 85308615b9 [build] prepare v0.6.1 SFT wheel packaging on main (#1945) Peilin Li 2026-04-24 12:08:38 +08:00
  • c7bf1be712 [build]: finalize py311+ wheel packaging defaults sft-whl JimmyPeilinLi 2026-04-24 04:01:51 +00:00
  • 161547cbe5 [build]: prepare 0.6.1 SFT wheel packaging on main JimmyPeilinLi 2026-04-23 09:22:57 +00:00
  • 4cd8cded34 [docs]: align install guides with explicit package flow sft JimmyPeilinLi 2026-04-22 09:44:01 +00:00
  • da53870bcb [chore]: remove integration patch package from top-level release JimmyPeilinLi 2026-04-22 09:33:35 +00:00
  • c41553a595 Revert "clean up dev artifacts: remove SFT design docs, debug examples, bench scripts" JimmyPeilinLi 2026-04-22 09:21:54 +00:00
  • ddfe92a07d [fix]: restore build wiring and track integration package JimmyPeilinLi 2026-04-22 09:21:49 +00:00
  • c9264e155c [build]: release v0.6.1 JimmyPeilinLi 2026-04-22 06:41:18 +00:00
  • 9544a8960d feat(sft): AMX MoE SFT backend with LoRA support (#1936) mrhaoxx 2026-04-22 11:27:01 +08:00
  • 948c75e76a remove dev version stamps from ext_bindings, sft_moe, moe-sft-tp mrhaoxx 2026-04-21 22:56:36 +08:00
  • a9bcee509c clean up dev artifacts: remove SFT design docs, debug examples, bench scripts mrhaoxx 2026-04-21 22:53:44 +08:00
  • 250e4fe52e merge: integrate origin/main into sft branch mrhaoxx 2026-04-21 22:40:07 +08:00
  • c4e88fb5af revert CMakeLists.txt to main: remove debug flags and cpptrace dep mrhaoxx 2026-04-21 20:56:02 +08:00
  • a789729923 align sft branch with main: revert worker_pool, strip sft_timer, fix inference defaults mrhaoxx 2026-04-21 17:39:56 +08:00
  • 6e45d02ebe deploy: 22e9915ec9 yyj6666667 2026-04-21 07:52:11 +00:00
  • 22e9915ec9 docs: add GOSIM 2026 announcement and update roadmap link to Q2 (#1937) ErvinXie 2026-04-21 15:51:08 +08:00
  • 5c5d7d48c0 [feat](kt-kernel): add MXFP4 MoE operator with E2M1 weights × BF16 activations ouqingliang 2026-04-21 02:53:04 +00:00
  • 168e10f254 [fix](sft): align Python API with C++ backend after v5 refactor JimmyPeilinLi 2026-04-20 16:44:09 +00:00
  • 00f9f8c0ef docs: add GOSIM 2026 announcement and update roadmap link to Q2 docs/gosim-2026-and-roadmap-link xwy 2026-04-20 22:25:02 +08:00
  • dd1da65d90 feat(sft): add Qwen3.5 MoE support + fused checkpoint loading mrhaoxx 2026-04-20 17:19:15 +08:00
  • 58d7eabb9b feat(sft): support transformers v5 fused expert format mrhaoxx 2026-04-20 13:21:29 +08:00
  • c3fa14f9c5 deploy: e327db58be ovowei 2026-04-18 13:30:33 +00:00
  • e327db58be Update README.md (#1935) Jianwei Dong 2026-04-18 21:30:13 +08:00
  • 17d9e49dd0 Update README.md ovowei-patch-2 Jianwei Dong 2026-04-18 21:28:46 +08:00
  • 92874ce177 deploy: a9f28d495b ovowei 2026-04-18 13:10:47 +00:00
  • a9f28d495b Update README.md (#1934) Jianwei Dong 2026-04-18 21:10:25 +08:00
  • b284e58f41 Update README.md ovowei-patch-1 Jianwei Dong 2026-04-18 21:10:07 +08:00
  • 06ee9f62f3 [doc]: add prerequisite note for GLM-5.1 tutorial (#1932) Benjamin F 2026-04-14 15:07:08 +08:00
  • a9411f1d72 Supports vnni-256 for GPTQ INT4 (#1926) callmegaga 2026-04-13 17:59:59 +08:00
  • f42e94a527 [fix](cli): handle edge cases with empty NUMA nodes (#1929) Andy18650 2026-04-13 16:45:41 +08:00
  • 6d4632b8c7 fix: add missing gpu_experts_mask=None to KTMoEWrapper call in SFT wrapper mrhaoxx 2026-04-10 02:18:40 +08:00
  • 5bfcb5f784 refactor(sft): share_backward_bb default True, share_cache_pool auto-derived mrhaoxx 2026-04-09 20:10:38 +08:00
  • 279c920a69 Revert "kt-kernel: enable CPUInfer stream bridge for ROCm (#1918)" (#1925) ErvinXie 2026-04-09 18:43:03 +08:00
  • 3d184a248e Revert "kt-kernel: enable CPUInfer stream bridge for ROCm (#1918)" revert-1918-fix/rocm-cpuinfer-stream-bridge ErvinXie 2026-04-09 18:42:46 +08:00
  • 020eb929f7 refactor(sft): unify KTConfig field names with kt_ prefix, add share_cache_pool, remove dead code mrhaoxx 2026-04-09 14:17:50 +08:00
  • 1dd0a78899 kt-kernel: enable CPUInfer stream bridge for ROCm (#1918) guanjiawei 2026-04-09 12:20:04 +08:00
  • 9b2d3b687b fix: remove broken symlink in archive/ktransformers/ (#1906) acture 2026-04-09 11:42:19 +08:00
  • ad19a3e653 Chore/kt layerwise prefill main (#1920) Oql 2026-04-09 11:28:37 +08:00
  • 7fd1b9dfe8 [chore]: use sglang main with KT layerwise prefill logs chore/kt-layerwise-prefill-main ouqingliang 2026-04-09 03:27:06 +00:00
  • dd59c7ebde [chore]: sync sglang submodule with KT layerwise prefill log rename ouqingliang 2026-04-09 03:25:44 +00:00
  • 38e95e3581 [chore]: update sglang submodule for KT layerwise prefill logs chore/kt-layerwise-prefill-label ouqingliang 2026-04-09 03:19:00 +00:00
  • a98d544833 merge: integrate origin/main into sft branch mrhaoxx 2026-04-08 23:19:28 +08:00
  • f36699affd feat(sft): AMX MoE SFT backend with LoRA support mrhaoxx 2026-04-08 23:11:00 +08:00
  • 07fd9328fa refactor(sft): move SFT logic into kt_kernel.sft submodule sft-rel mrhaoxx 2026-04-08 23:07:41 +08:00
  • 891c5c0a13 Support glm5.1 (#1916) Jianwei Dong 2026-04-07 11:29:32 +08:00
  • 4f43de3169 fix support-glm51-djw ovowei 2026-04-07 11:26:26 +08:00
  • 0357113610 support glm5.1 ovowei 2026-04-07 11:18:34 +08:00
  • 8a427c9321 [feat]: add AVX512F+BW fallback for FP8 and BF16 under AMX backend (#1908) Jim James 2026-04-03 00:46:22 -04:00
  • db9326302b chore: bump version to 0.5.3 (#1909) v0.5.3 Jianwei Dong 2026-04-01 18:58:48 +08:00
  • b21c848922 chore: bump version to 0.5.3 release/v0.5.3 ovowei 2026-04-01 18:57:11 +08:00
  • 9e6484a538 [fix]: fix --numa-nodes handling (#1904) Oql 2026-03-31 17:50:22 +08:00
  • 5cf573307e [chore]: unify kt-run numa handling chore/kt-run-numa-nodes-unified ouqingliang 2026-03-31 09:47:57 +00:00