Don't ignore the return value of create_tensors()

Otherwise, when the q, k, v tensors get merged and we are running on the CPU,
we get a crash because the backend tries to use mmap,
which no longer works in that case.
Iwan Kawrakow
2025-10-29 11:15:20 +02:00
parent 2b3af4addc
commit 6c53a97122

@@ -1684,7 +1684,7 @@ static bool llm_load_tensors(
throw std::runtime_error("model has expert layers but no expert layers are used");
}
-cth->create_tensors();
+use_mmap_buffer = cth->create_tensors();
ml.done_getting_tensors();
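
The pattern behind this fix can be sketched in isolation. The following is a minimal illustration, not the actual loader code: `TensorHelperSketch`, `load_sketch`, and the rule that merging q, k, v disables mmap are assumptions made up for the example, and the `[[nodiscard]]` attribute is a suggestion that is not part of this commit. It shows a caller that overwrites its `use_mmap_buffer` flag with the value returned by `create_tensors()` instead of discarding it.

```cpp
#include <cassert>

// Hypothetical stand-in for the tensor-creation helper (not the real class).
struct TensorHelperSketch {
    bool merge_qkv        = false;  // assumed: q, k, v are merged into one tensor
    bool mmap_requested   = true;   // assumed: the caller asked for mmap

    // Returns whether the mmap-backed buffer may still be used afterwards.
    // [[nodiscard]] (C++17) makes the compiler warn if a caller drops the result.
    [[nodiscard]] bool create_tensors() const {
        // Assumption for illustration only: merged q, k, v tensors rule out mmap.
        return mmap_requested && !merge_qkv;
    }
};

// Mirrors the fixed call site: the flag is refreshed from the return value.
bool load_sketch(bool requested_mmap, bool merge_qkv) {
    bool use_mmap_buffer = requested_mmap;
    TensorHelperSketch cth{merge_qkv, requested_mmap};
    use_mmap_buffer = cth.create_tensors();  // do not ignore the result
    return use_mmap_buffer;
}

// Small usage example exercising both outcomes.
void usage_example() {
    assert(load_sketch(true, false));   // no merge: mmap stays enabled
    assert(!load_sketch(true, true));   // merged q, k, v: mmap is switched off
}
```

With the attribute in place, a bare `cth.create_tensors();` statement would produce a compiler warning, which is one way to keep this kind of bug from coming back.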