Files
ik_llama.cpp/ggml
Iwan Kawrakow c69d04f324 gpt-oss: WIP llama
Model loads and runs (CPU only), but PPL is much too high
(~1500 for the 1st batch vs ~200 in mainline).
Is it because of SWA, because of vocab, or did I introduce a bug somewhere?
2025-08-10 10:09:42 +03:00