Brayden Zhong | 6a9b09847c | CUTLASS NVFP4 GEMM improvement of SM120 (#21314) | 2026-04-01 09:04:34 +08:00
Артем Савкин | 27071e0a43 | [NPU] Update quantization&CI documentation (#21100); Co-authored-by: Tamir Baydasov <41994229+TamirBaydasov@users.noreply.github.com> | 2026-03-28 21:42:21 +03:00
Mook | 23c191afb6 | fix(docs): correct quantization documentation (#20301) (#20619) | 2026-03-15 12:33:12 -04:00
Brayden Zhong | 591e61245a | [Doc] Add smal table for GEMM backends (#20213) | 2026-03-09 22:19:57 -07:00
Bruce Changlong Xu | feda2b11c4 | [AMD] Add AWQ AMD CI coverage and quantization platform compatibility docs (#19550) | 2026-03-04 19:50:55 -08:00
Zack Yu | 54589a2f2d | docs: expand and update modelopt documentation (#18479); Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>; Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> | 2026-02-09 23:09:52 +00:00
fxmarty-amd | 5af84c8af5 | [AMD][Quantization] Add int4fp8_moe online quantization on ROCm (#7392); Co-authored-by: Dehua Tang <dehtang@amd.com>; Co-authored-by: HAI <hixiao@gmail.com>; Co-authored-by: YC Tseng <yctseng@amd.com> | 2026-01-14 01:44:40 -08:00
Zhiyu | f6423b626c | Rename TensorRT Model Optimizer to Model Optimizer (#14455); Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com> | 2025-12-07 13:18:20 -08:00
b8zhong | 88d1bab537 | add doc for quantized kv cache (#14348); Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>; Co-authored-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com> | 2025-12-04 13:01:05 -08:00
赵晨阳 | c56fc42430 | Update quantization.md with new model resources (#13677) | 2025-11-20 15:50:16 -08:00
Weiwei | caa4819bfc | Add support for AutoRound quantized models (#10153) | 2025-10-27 18:17:29 +08:00
Zhiyu | 80b2b3207a | Enable native ModelOpt quantization support (3/3) (#10154); Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com> | 2025-10-21 21:44:29 -07:00
Yineng Zhang | b7d1f17b8d | Revert "enable auto-round quantization model (#6226)" (#10148) | 2025-09-07 22:31:11 -07:00
Weiwei | c8295d2353 | enable auto-round quantization model (#6226); Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> | 2025-09-07 22:05:35 -07:00
Lianmin Zheng | 2449a0afe2 | Refactor the docs (#9031) | 2025-08-10 19:49:45 -07:00