Support --device and --device-draft parameter (#866)

* add --device and --device-draft parameter
* don't print debug message in release mode
* fix
* bug fix to throw exception when no device specified
* add const

---------

Co-authored-by: firecoperana <firecoperana>
2026-03-13 23:40:09 +00:00 · 2025-10-27 16:13:28 +00:00
parent eb8116b097
commit 904e994bfb
12 changed files with 283 additions and 40 deletions
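
Note on the speculative.cpp hunk: the example appears to reuse a single params object for both models, overwriting the target fields with their draft counterparts before the draft model is loaded. The sketch below illustrates that pattern in isolation. It is not the project's code: `sketch_params` and `make_draft_params` are hypothetical names, and the field types (plain strings and ints) are assumptions made only so the example compiles on its own.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the example's parameter struct.
// Field names mirror those in the hunk; the types are assumed.
struct sketch_params {
    std::string devices;             // device list for the model being loaded
    std::string devices_draft;       // device list requested for the draft model
    std::string model;               // model path the loader reads
    std::string model_draft;         // draft model path
    int n_gpu_layers       = -1;     // layers to offload for the model being loaded
    int n_gpu_layers_draft = -1;     // layers to offload for the draft model
};

// Returns a copy of the target parameters in which every field the loader
// reads has been replaced by its draft counterpart.
sketch_params make_draft_params(const sketch_params & target) {
    sketch_params draft = target;
    draft.devices      = target.devices_draft;   // the assignment this commit adds
    draft.model        = target.model_draft;
    draft.n_gpu_layers = target.n_gpu_layers_draft;
    return draft;
}
```

Without the `devices` assignment, the draft model would be loaded with the target model's device list, which is presumably what the added line corrects. The sketch returns a copy rather than mutating in place only to keep the target values available for comparison; the hunk itself overwrites `params` directly.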
--- a/examples/speculative/speculative.cpp
+++ b/examples/speculative/speculative.cpp
@@ -71,6 +71,7 @@ int main(int argc, char ** argv) {
    ctx_tgt = llama_init_tgt.context;

    // load the draft model
+    params.devices = params.devices_draft;
    params.model = params.model_draft;
    params.n_gpu_layers = params.n_gpu_layers_draft;
    if (params.n_threads_draft > 0) {