Support --device and --device-draft parameter (#866)

* add --device and --device-draft parameter

* don't print debug message in release mode

* fix

* bug fix to throw exception when no device specified

* add const

---------

Co-authored-by: firecoperana <firecoperana>
firecoperana
2025-10-27 16:13:28 +00:00
committed by GitHub
parent eb8116b097
commit 904e994bfb
12 changed files with 283 additions and 40 deletions
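
The commit message mentions a device option and a fix that throws when no device is specified. As a rough illustration only, the sketch below assumes the option value is a comma-separated list of device names. The helper name `parse_device_list`, the exception type, and the device names in the usage note are all hypothetical and are not taken from this commit.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the commit): split a comma-separated
// device option value into individual device names.
// Assumption: an empty result is an error, echoing the commit message's
// "throw exception when no device specified".
std::vector<std::string> parse_device_list(const std::string & value) {
    std::vector<std::string> devices;
    std::stringstream ss(value);
    std::string item;
    while (std::getline(ss, item, ',')) {
        if (!item.empty()) {
            devices.push_back(item); // skip empty entries such as "a,,b"
        }
    }
    if (devices.empty()) {
        throw std::invalid_argument("no devices specified");
    }
    return devices;
}
```

Under these assumptions, `parse_device_list("CUDA0,CUDA1")` yields two entries, and an empty string raises `std::invalid_argument`. The hunk below then shows the draft device list replacing the main one before the draft model is loaded.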

@@ -71,6 +71,7 @@ int main(int argc, char ** argv) {
     ctx_tgt = llama_init_tgt.context;
     // load the draft model
+    params.devices = params.devices_draft;
     params.model = params.model_draft;
     params.n_gpu_layers = params.n_gpu_layers_draft;
     if (params.n_threads_draft > 0) {