mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-20 21:24:08 +00:00
without my change

ggml_backend_cuda_graph_compute: disabling CUDA graphs due to mul_mat_id
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to too many consecutive updates

| PP | TG | N_KV | T_PP s | S_PP t/s | T_TG s | S_TG t/s |
|-------|--------|--------|----------|----------|----------|----------|
| 8192 | 2048 | 0 | 54.433 | 150.50 | 414.061 | 4.95 |
| 8192 | 2048 | 8192 | 64.162 | 127.68 | 428.767 | 4.78 |

after my change to CMakeLists.txt

| PP | TG | N_KV | T_PP s | S_PP t/s | T_TG s | S_TG t/s |
|-------|--------|--------|----------|----------|----------|----------|
| 8192 | 2048 | 0 | 58.363 | 140.36 | 405.040 | 5.06 |
| 8192 | 2048 | 8192 | 63.752 | 128.50 | 423.548 | 4.84 |
| 8192 | 2048 | 16384 | 69.712 | 117.51 | 431.367 | 4.75 |
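
To make the before/after comparison easier to read, here is a minimal Python sketch that computes the relative change in throughput for the rows present in both tables. The numbers are copied from the tables above. The helper names (`throughput`, `percent_change`, `compare`) are hypothetical and not part of the repository, and the relation S = tokens / T is an assumption that happens to match the reported columns.

```python
# Rows copied from the two tables above:
# (PP, TG, N_KV, T_PP s, S_PP t/s, T_TG s, S_TG t/s)
before = [
    (8192, 2048, 0, 54.433, 150.50, 414.061, 4.95),
    (8192, 2048, 8192, 64.162, 127.68, 428.767, 4.78),
]
after = [
    (8192, 2048, 0, 58.363, 140.36, 405.040, 5.06),
    (8192, 2048, 8192, 63.752, 128.50, 423.548, 4.84),
    (8192, 2048, 16384, 69.712, 117.51, 431.367, 4.75),
]


def throughput(tokens, seconds):
    """Tokens per second (assumed definition of the S_PP and S_TG columns)."""
    return tokens / seconds


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def compare(before_rows, after_rows):
    """Map N_KV -> (S_PP change %, S_TG change %) for rows found in both tables."""
    after_by_kv = {row[2]: row for row in after_rows}
    result = {}
    for row in before_rows:
        n_kv = row[2]
        if n_kv not in after_by_kv:
            continue  # no counterpart to compare against
        other = after_by_kv[n_kv]
        result[n_kv] = (
            percent_change(row[4], other[4]),
            percent_change(row[6], other[6]),
        )
    return result


changes = compare(before, after)
for n_kv, (pp, tg) in sorted(changes.items()):
    print(f"N_KV={n_kv}: S_PP {pp:+.2f}%, S_TG {tg:+.2f}%")
```

Only N_KV = 0 and N_KV = 8192 are compared, since the N_KV = 16384 row exists only in the second table.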