ik_llama.cpp/llama.cpp at a40c1d87ff3c138ad4c629649624a48c5dcadd6f

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-09 21:40:22 +00:00

Kawrakow ab18bc5e87 k_quants tuning for Falcon-7b (#2816)

* Make ggml-cuda.cu build with QK_K = 64

Using LLAMA_CUDA_FORCE_DMMV = ON and -nommq it runs and produces
a meaningful result.

* k_quants tuning for Falcon-7b

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2023-08-27 15:19:59 +03:00

221 KiB
