batched : add bench tool (#3545)

* batched : add bench tool

* batched : minor fix table

* batched-bench : add readme + n_kv_max is now configurable

* batched-bench : init warm-up batch

* batched-bench : pass custom set of PP, TG and PL

* batched-bench : add mmq CLI arg
Georgi Gerganov
2023-10-11 21:25:33 +03:00
committed by GitHub
parent dcdafa74c6
commit f11fd81fbd
7 changed files with 321 additions and 3 deletions

1
.gitignore vendored

@@ -55,6 +55,7 @@ models-mnt
 /server
 /simple
 /batched
+/batched-bench
 /export-lora
 /finetune
 /speculative