# llama.cpp/example/passkey

A passkey retrieval task is an evaluation method used to measure a language
model's ability to recall information from long contexts.

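The test hides a "passkey" (for example, a random number) inside a long stretch of repeated filler text and then asks the model to recall it. A minimal Python sketch of how such a prompt is typically constructed; the helper name and exact wording below are illustrative assumptions, not this example's actual code:

```python
import random

def build_passkey_prompt(passkey: int, n_junk: int = 250, i_pos: int | None = None) -> str:
    # Hypothetical helper: repeat filler text n_junk times and insert the
    # passkey sentence at one position, then ask for it back.
    if i_pos is None:
        i_pos = random.randint(0, n_junk - 1)  # bury the passkey at a random depth
    junk = ("The grass is green. The sky is blue. The sun is yellow. "
            "Here we go. There and back again.")
    secret = f" The pass key is {passkey}. Remember it. {passkey} is the pass key."
    parts = ["There is an important info hidden inside a lot of irrelevant text. "
             "Find it and memorize it. I will quiz you about it afterwards."]
    for i in range(n_junk):
        if i == i_pos:
            parts.append(secret)
        parts.append(" " + junk)
    parts.append(" What is the pass key? The pass key is")
    return "".join(parts)

# Short prompt for demonstration; real runs use hundreds of repetitions.
print(build_passkey_prompt(passkey=42, n_junk=5))
```

The model passes the test if its completion contains the hidden passkey.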
See the following PRs for more info:
- https://github.com/ggerganov/llama.cpp/pull/3856
- https://github.com/ggerganov/llama.cpp/pull/4810
### Usage
```bash
make -j && ./llama-passkey -m ./models/llama-7b-v2/ggml-model-f16.gguf --junk 250
```
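
Here `--junk 250` sets the number of filler-text repetitions; increasing it lengthens the context the model must search through and makes retrieval harder.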