ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-13 07:20:15 +00:00

Kawrakow 85c6152e85 Give the user the option to override where model weights are stored (#232)

* Give the user the option to override where model weights are stored

* Fix ggml_nbytes() problem and cleanup

For a tensor with zero elements, ggml_nbytes() was returning
uint64_t::max, which caused graph allocation to fail.
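As an illustration only, the sketch below shows one way a byte-size computation can wrap around to a huge unsigned value when a dimension is zero, and a guard that returns 0 instead. It is not the ggml implementation: `toy_tensor`, `toy_nbytes_naive`, `toy_nbytes_guarded`, and `toy_check` are hypothetical names, and the padded strides are an assumption chosen to make the wraparound visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DIMS 4

/* Hypothetical stand-in for a tensor descriptor; not the real ggml_tensor. */
struct toy_tensor {
    int64_t ne[TOY_MAX_DIMS]; /* number of elements per dimension */
    size_t  nb[TOY_MAX_DIMS]; /* stride in bytes per dimension */
    size_t  elem_size;        /* size of one element in bytes */
};

/* Naive size: byte offset of the last element plus one element.
 * If any ne[i] is 0, (ne[i] - 1) converts to SIZE_MAX and the sum
 * wraps around (unsigned arithmetic is modular, so this is well defined). */
size_t toy_nbytes_naive(const struct toy_tensor * t) {
    size_t nbytes = t->elem_size;
    for (int i = 0; i < TOY_MAX_DIMS; ++i) {
        nbytes += (size_t)(t->ne[i] - 1) * t->nb[i];
    }
    return nbytes;
}

/* Guarded size: a tensor with no elements occupies no bytes. */
size_t toy_nbytes_guarded(const struct toy_tensor * t) {
    for (int i = 0; i < TOY_MAX_DIMS; ++i) {
        if (t->ne[i] <= 0) {
            return 0;
        }
    }
    return toy_nbytes_naive(t);
}

/* Small self-check using assumed shapes: 8 floats per row, rows padded to 64 bytes. */
void toy_check(void) {
    struct toy_tensor empty = { {8, 0, 1, 1}, {4, 64, 64, 64}, 4 };
    struct toy_tensor full  = { {8, 2, 1, 1}, {4, 64, 128, 128}, 4 };

    assert(toy_nbytes_naive(&empty) > SIZE_MAX / 2); /* wrapped around */
    assert(toy_nbytes_guarded(&empty) == 0);
    assert(toy_nbytes_guarded(&full) == toy_nbytes_naive(&full));
}
```

An allocator that sums such sizes would see an impossibly large request for the empty tensor in the naive version, which is consistent with the kind of allocation failure described above.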

* Add timing info to CUDA graph evaluation

* Add more timing info

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2025-02-25 17:55:58 +02:00

llama.h

Give the user the option to override where model weights are stored (#232)

2025-02-25 17:55:58 +02:00