Don't ignore the return value of create_tensors()

Otherwise, when q, k, and v get merged and we are running on the CPU, we crash because the backend tries to use mmap, which no longer works.
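
A minimal sketch of the failure mode, not the actual llama.cpp code: `TensorBuilder`, its two fields, and `resolve_use_mmap()` are hypothetical names made up for illustration. It assumes create_tensors() returns whether mmap-backed buffers are still usable, and that merging q, k, v is what rules them out.

```cpp
// Hypothetical stand-in for the object behind `cth`; not the real llama.cpp type.
struct TensorBuilder {
    bool use_mmap  = true;   // assumption: what the caller originally asked for
    bool merge_qkv = false;  // assumption: merging q, k, v rules out mmap-backed buffers

    // [[nodiscard]] (C++17) asks the compiler to warn when a caller drops the result.
    [[nodiscard]] bool create_tensors() const {
        return use_mmap && !merge_qkv;  // true only if mmap buffers are still usable
    }
};

// Mirrors the fixed call site: the flag is taken from the return value
// instead of keeping whatever value was requested before tensor creation.
inline bool resolve_use_mmap(const TensorBuilder & builder) {
    bool use_mmap_buffer = builder.use_mmap;
    use_mmap_buffer = builder.create_tensors();
    return use_mmap_buffer;
}
```

With this sketch, a builder that merges q, k, v yields `false` from resolve_use_mmap() even though mmap was requested, so later buffer allocation would not go down the mmap path. Marking the function `[[nodiscard]]` is one way to make the compiler flag a call site that discards the result.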
2026-05-11 08:30:19 +00:00 · 2025-10-29 11:15:20 +02:00
parent 2b3af4addc
commit 6c53a97122
1 changed file with 1 addition and 1 deletion
--- a/src/llama.cpp
+++ b/src/llama.cpp
@@ -1684,7 +1684,7 @@ static bool llm_load_tensors(
        throw std::runtime_error("model has expert layers but no expert layers are used");
    }

-    cth->create_tensors();
+    use_mmap_buffer = cth->create_tensors();

    ml.done_getting_tensors();