mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-25 23:54:10 +00:00
iq2_tn: TriLM specific 2.0625 bpw quantization
Quantize/dequantize/scale dot product. I get 46 t/s for TriLM-3.9B without any SIMD! Finally a compiler doing a decent job auto-vectorizing the scalar implementation.
This commit is contained in:
@@ -174,6 +174,7 @@ extern "C" {
     LLAMA_FTYPE_MOSTLY_IQ3_K  = 39, // except 1d tensors
     LLAMA_FTYPE_MOSTLY_IQ4_K  = 40, // except 1d tensors
     LLAMA_FTYPE_MOSTLY_IQ5_K  = 41, // except 1d tensors
+    LLAMA_FTYPE_MOSTLY_IQ2_TN = 42, // except 1d tensors

     LLAMA_FTYPE_GUESSED = 1024, // not specified in the model file
 };