ik_llama.cpp/.gitignore at 9e824bf15c72114cf601c2c19b9fcb9adf528666

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-29 10:51:51 +00:00

Georgi Gerganov 73a59affb2 ggml : use 8-bit precision for Q4_1 intermediate results (#1047)

* ggml : use 8-bit precision for Q4_1 intermediate results (ARM)

* ggml : optimize ggml_vec_dot_q4_1_q8_0() via vmalq_n_f32

56 ms/token with Q4_1 !

* ggml : AVX2 implementation of ggml_vec_dot_q4_1_q8_0 (#1051)

* gitignore : ignore ppl-*.txt files

---------

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>

2023-04-19 20:10:08 +03:00

391 B
