Give the user the option to override where model weights are stored (#232)

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-02 18:10:02 +00:00

* Give the user the option to override where model weights are stored

* Fix ggml_nbytes() problem and cleanup

For a tensor with zero elements, ggml_nbytes() was returning the
maximum uint64_t value, which caused graph allocation to fail.

* Add timing info to CUDA graph evaluation

* Add more timing info
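
Note on the ggml_nbytes() item above: the diff for that fix is not shown here, so the following is only a hypothetical sketch of how an unsigned size computation can wrap to the largest uint64_t value for an empty tensor, and how an early return avoids it. The names toy_tensor, toy_last_byte_naive, toy_nbytes_guarded and toy_can_allocate are invented for illustration and are not part of ggml.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a tensor (NOT the real ggml_tensor):
// an element count and the size of one element in bytes.
struct toy_tensor {
    int64_t  n_elements;
    uint64_t type_size;
};

// Naive helper: index of the last byte occupied by the tensor.
// For n_elements == 0 this evaluates 0 - 1 in unsigned arithmetic,
// which wraps around to the largest uint64_t value.
inline uint64_t toy_last_byte_naive(const toy_tensor & t) {
    return static_cast<uint64_t>(t.n_elements) * t.type_size - 1;
}

// Guarded size computation: an empty tensor occupies zero bytes,
// so return early before any subtraction can wrap.
inline uint64_t toy_nbytes_guarded(const toy_tensor & t) {
    if (t.n_elements <= 0) {
        return 0;
    }
    return static_cast<uint64_t>(t.n_elements) * t.type_size;
}

// Toy allocator check: a request fits only if it does not exceed capacity.
inline bool toy_can_allocate(uint64_t request, uint64_t capacity) {
    return request <= capacity;
}
```

With a wrapped value, a check like toy_can_allocate() rejects the request for any realistic capacity, which is the kind of allocation failure the commit message describes. With the guard, the empty tensor requests zero bytes and passes.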

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

Kawrakow authored 2025-02-25 17:55:58 +02:00, committed by GitHub

parent 547eee81d9

commit 94b659a2f1

9 changed files with 848 additions and 621 deletions

src/llama.cpp: 1302 changes (file diff suppressed because it is too large)
