mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-28 00:54:09 +00:00
* iq1_tn: Metal implementation

  Requires changing the get_rows and matrix multiplication kernels to use a dequantizer type rather than a dequantization function. Once this is done, we can simply reuse the iq1_bn implementation.

  This change will also make it possible to add other quantization types that have metadata (such as a row scale) stored at the beginning of a row, or to change existing quantization types to row-wise scales.

* Some cleanup

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
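
The difference between a dequantization function and a dequantizer type can be illustrated with a minimal sketch in plain CPU-side C++ (not Metal). All names here (`BlockQ`, `PlainDequantizer`, `RowScaleDequantizer`, `get_row`) and the 4-value block layout are hypothetical and chosen only for illustration; they are not the actual iq1_bn/iq1_tn layouts or kernels. The idea is that a type can be constructed once per row, read any metadata at the start of that row, and then decode blocks, which a stateless per-block function cannot do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical block layout: 4 signed 8-bit quants, no per-block scale.
constexpr int kBlockSize = 4;
struct BlockQ { int8_t q[kBlockSize]; };

// Dequantizer for rows that are a plain array of blocks (no row metadata).
struct PlainDequantizer {
    const BlockQ* blocks;
    explicit PlainDequantizer(const void* row)
        : blocks(static_cast<const BlockQ*>(row)) {}
    void dequantize(int ib, float* out) const {
        for (int j = 0; j < kBlockSize; ++j) out[j] = float(blocks[ib].q[j]);
    }
};

// Dequantizer for rows that begin with one float row scale, followed by
// the same blocks. It reuses the plain block decoding and applies the scale.
struct RowScaleDequantizer {
    float scale;
    PlainDequantizer inner;
    explicit RowScaleDequantizer(const void* row)
        : scale(0.0f),
          inner(static_cast<const char*>(row) + sizeof(float)) {
        std::memcpy(&scale, row, sizeof(float));  // read row metadata once
    }
    void dequantize(int ib, float* out) const {
        inner.dequantize(ib, out);
        for (int j = 0; j < kBlockSize; ++j) out[j] *= scale;
    }
};

// A get_rows-style routine templated on the dequantizer type rather than
// on a free dequantization function.
template <typename Dequantizer>
std::vector<float> get_row(const void* row, int n) {
    assert(n % kBlockSize == 0);
    Dequantizer deq(row);  // per-row setup happens here
    std::vector<float> out(n);
    for (int ib = 0; ib < n / kBlockSize; ++ib) {
        deq.dequantize(ib, out.data() + ib * kBlockSize);
    }
    return out;
}
```

Under these assumptions, `get_row<RowScaleDequantizer>(row, n)` and `get_row<PlainDequantizer>(row, n)` share the same loop, and supporting a new row-wise format only requires a new dequantizer type.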