Support --device and --device-draft parameters (#866)

* add --device and --device-draft parameters

* don't print debug message in release mode

* fix

* bug fix: throw an exception when no device is specified

* add const

---------

Co-authored-by: firecoperana <firecoperana>
firecoperana
2025-10-27 16:13:28 +00:00
committed by GitHub
parent bdf4f0ddce
commit 6dc5bd847b
12 changed files with 283 additions and 40 deletions
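
As a rough illustration of the behaviour the commit message describes (a device list supplied on the command line, with an exception when no device is given), the following is a minimal sketch. It is not the code from this commit: the helper name `parse_device_list`, the comma-separated input format, and the choice of `std::invalid_argument` are all assumptions made for the example.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not taken from the commit): split a device argument
// such as "CUDA0,CUDA1" into individual device names.
// Assumption: devices are separated by commas.
std::vector<std::string> parse_device_list(const std::string & value) {
    std::vector<std::string> devices;
    std::stringstream ss(value);
    std::string name;
    while (std::getline(ss, name, ',')) {
        if (!name.empty()) {
            devices.push_back(name); // skip empty entries like "a,,b"
        }
    }
    if (devices.empty()) {
        // Mirrors the "throw exception when no device specified" item;
        // the exception type here is an assumption.
        throw std::invalid_argument("no device specified");
    }
    return devices;
}
```

Under these assumptions, the result of `parse_device_list` could be stored in a `std::vector<std::string>` field such as `devices` or `devices_draft`, and an empty argument would be rejected early instead of silently falling back to a default.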


@@ -12,6 +12,9 @@ struct llama_cparams {
    uint32_t n_threads;       // number of threads to use for generation
    uint32_t n_threads_batch; // number of threads to use for batch processing
    std::vector<std::string> devices;
    std::vector<std::string> devices_draft;
    float rope_freq_base;
    float rope_freq_scale;