sglang

mirror of https://github.com/kvcache-ai/sglang.git synced 2026-07-01 04:08:10 +00:00

Author	SHA1	Message	Date
Brayden Zhong	6a9b09847c	CUTLASS NVFP4 GEMM improvement of SM120 (#21314)	2026-04-01 09:04:34 +08:00
Артем Савкин	27071e0a43	[NPU] Update quantization&CI documentation (#21100) Co-authored-by: Tamir Baydasov <41994229+TamirBaydasov@users.noreply.github.com>	2026-03-28 21:42:21 +03:00
Mook	23c191afb6	fix(docs): correct quantization documentation (#20301) (#20619)	2026-03-15 12:33:12 -04:00
Brayden Zhong	591e61245a	[Doc] Add smal table for GEMM backends (#20213)	2026-03-09 22:19:57 -07:00
Bruce Changlong Xu	feda2b11c4	[AMD] Add AWQ AMD CI coverage and quantization platform compatibility docs (#19550)	2026-03-04 19:50:55 -08:00
Zack Yu	54589a2f2d	docs: expand and update modelopt documentation (#18479) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-09 23:09:52 +00:00
fxmarty-amd	5af84c8af5	[AMD][Quantization] Add `int4fp8_moe` online quantization on ROCm (#7392) Co-authored-by: Dehua Tang <dehtang@amd.com> Co-authored-by: HAI <hixiao@gmail.com> Co-authored-by: YC Tseng <yctseng@amd.com>	2026-01-14 01:44:40 -08:00
Zhiyu	f6423b626c	Rename TensorRT Model Optimizer to Model Optimizer (#14455) Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>	2025-12-07 13:18:20 -08:00
b8zhong	88d1bab537	add doc for quantized kv cache (#14348) Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com> Co-authored-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com>	2025-12-04 13:01:05 -08:00
赵晨阳	c56fc42430	Update quantization.md with new model resources (#13677)	2025-11-20 15:50:16 -08:00
Weiwei	caa4819bfc	Add support for AutoRound quantized models (#10153)	2025-10-27 18:17:29 +08:00
Zhiyu	80b2b3207a	Enable native ModelOpt quantization support (3/3) (#10154) Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>	2025-10-21 21:44:29 -07:00
Yineng Zhang	b7d1f17b8d	Revert "enable auto-round quantization model (#6226)" (#10148)	2025-09-07 22:31:11 -07:00
Weiwei	c8295d2353	enable auto-round quantization model (#6226) Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>	2025-09-07 22:05:35 -07:00
Lianmin Zheng	2449a0afe2	Refactor the docs (#9031)	2025-08-10 19:49:45 -07:00

15 Commits