Load all MoE experts during warmup and make warmup 1 token (#198)

* Load all MoE experts during warmup

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

* Unify warmup to one token

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
saood06
2025-02-10 09:40:38 -06:00
committed by GitHub
parent c12f73ba61
commit a366a3d17d
3 changed files with 17 additions and 10 deletions
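
One way to read "Load all MoE experts during warmup": a mixture-of-experts layer normally routes each token to only a few experts, so a one-token warmup pass would leave most expert weights untouched. The sketch below illustrates that idea only; it is not the code from this commit. The function name `experts_touched` and its parameters are hypothetical, and the assumption is that warmup simply selects every expert instead of the top-k.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the commit): returns the indices of the
// experts one token would touch, given its router scores.
std::vector<std::size_t> experts_touched(const std::vector<float> & scores,
                                         std::size_t n_expert_used,
                                         bool warmup) {
    const std::size_t n_expert = scores.size();
    assert(n_expert_used <= n_expert);

    // Assumption: during warmup every expert is selected, so each expert's
    // weights are read at least once by a single token.
    const std::size_t k = warmup ? n_expert : n_expert_used;

    std::vector<std::size_t> idx(n_expert);
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    // Move the k highest-scoring experts to the front, in descending order.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });
    idx.resize(k);
    return idx;
}
```

With four experts and `n_expert_used = 2`, a normal pass returns two indices, while a warmup pass returns all four. Under this reading, a single warmup token is enough to touch every expert.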


@@ -1586,7 +1586,7 @@ int main(int argc, char ** argv) {
 if (params.warmup) {
     if (t.n_prompt > 0) {
         //test_prompt(ctx, std::min(t.n_batch, std::min(t.n_prompt, 32)), 0, t.n_batch, t.n_threads);
-        test_prompt(ctx, t.n_prompt, 0, t.n_batch, t.n_threads);
+        test_prompt(ctx, 1, 0, t.n_batch, t.n_threads);
     }
     if (t.n_gen > 0) {
         test_gen(ctx, 1, 0, t.n_threads);