ik_llama.cpp/llama.h at 1966eb2615242f224bf9ca939db8905ab6a174a0

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-28 10:21:48 +00:00

jiez 1966eb2615 quantize : add '--keep-split' to quantize model into shards (#6688)

* Implement '--keep-split' to quantize model into several shards

* Add test script

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Split model correctly even if tensor id is out-of-order

* Update llama_model_quantize_params

* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-25 13:29:35 +03:00

51 KiB
