POC: per row scale

This is a POC of how to work around opinionated ggml to
have scales per row rather than per block.
Only implemented for Zen4 and only for iq2_tn.
Iwan Kawrakow
2024-09-17 16:04:59 +03:00
parent 546f3ef349
commit 86237d0555
6 changed files with 90 additions and 37 deletions

@@ -2517,6 +2517,7 @@ extern "C" {
        int64_t ncols; // number of columns to process simultaneously
        ggml_gemv_t gemv;
        ggml_gemm_t gemm;
        int64_t row_meta_size;
    } ggml_type_traits_t;
    GGML_API ggml_type_traits_t ggml_internal_get_type_traits(enum ggml_type type);
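
The hunk above only adds the `row_meta_size` field; how the rest of the commit uses it is not shown here. As a minimal sketch of the idea in the commit message, assume each quantized row starts with `row_meta_size` bytes of metadata (here a single `float` scale) followed by the packed blocks. The `row_layout_t` struct and the helpers `row_size_bytes`, `row_scale`, and `row_blocks` are hypothetical names for illustration and are not part of ggml.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical description of a row layout (not a ggml structure).
typedef struct {
    size_t  block_bytes;   // bytes occupied by one quantized block
    int64_t block_elems;   // number of weights stored in one block
    size_t  row_meta_size; // bytes of per-row metadata placed before the blocks
} row_layout_t;

// Bytes needed for one row: per-row metadata followed by the packed blocks.
static size_t row_size_bytes(const row_layout_t *l, int64_t n_per_row) {
    assert(n_per_row % l->block_elems == 0); // rows must hold whole blocks
    return l->row_meta_size
         + (size_t)(n_per_row / l->block_elems) * l->block_bytes;
}

// Read the per-row scale, assumed to be a float at the start of the row.
// memcpy avoids alignment and aliasing problems on the raw byte buffer.
static float row_scale(const void *row) {
    float d;
    memcpy(&d, row, sizeof d);
    return d;
}

// Pointer to the first quantized block, which follows the row metadata.
static const uint8_t *row_blocks(const row_layout_t *l, const void *row) {
    return (const uint8_t *)row + l->row_meta_size;
}
```

Under these assumptions a kernel would read the scale once per row with `row_scale`, accumulate the dot product over the blocks returned by `row_blocks`, and multiply the result by the scale at the end, instead of applying a scale per block.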