ik_llama.cpp/examples/server/server.cpp at 1ec12b8e3b0c96eacf82c10d71ce5f4624c1eadf

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-02 10:00:07 +00:00

gapeleon 17d101863d server: add dynamic control vector management endpoints (#1223)

This implements the ability to load, unload, and scale control vectors
(representation engineering) mid-inference, following the existing
task-queue pattern used by LoRA adapters.

New Endpoints:
- GET  /control-vectors
- POST /control-vectors/load
- POST /control-vectors/unload
- POST /control-vectors/apply (handles scaling)

Technical Notes:
- Centralizes vector aggregation logic to share implementation between
  load, unload, and apply tasks.
- Vectors are applied globally to the model context.
- Enforces dimension validation on load to safely reject incompatible
  vectors.
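
The notes above can be illustrated with a minimal sketch. This is not the code from the commit: `ControlVector`, `ControlVectorStore`, and every member name below are hypothetical, and the sketch assumes each vector is a flat array of floats whose length must match a fixed expected size. It only shows the general idea of one shared aggregation routine whose result is recomputed after any load, unload, or apply operation, with incompatible vectors rejected at load time.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one loaded control vector.
struct ControlVector {
    std::vector<float> data;  // flattened direction values
    float scale = 0.0f;       // strength; 0 means loaded but inactive
};

// Hypothetical store: load/unload/apply only mutate state, and a single
// aggregate() routine produces the combined vector for all three paths.
struct ControlVectorStore {
    std::size_t expected_size;                    // assumed model-dependent length
    std::map<std::string, ControlVector> loaded;  // keyed by an illustrative name

    // Reject vectors whose size does not match the expected dimension.
    bool load(const std::string& name, std::vector<float> data) {
        if (data.size() != expected_size) {
            return false;
        }
        loaded[name] = ControlVector{std::move(data), 0.0f};
        return true;
    }

    // Remove a vector; returns false if it was never loaded.
    bool unload(const std::string& name) {
        return loaded.erase(name) > 0;
    }

    // Set the scale of an already loaded vector.
    bool apply(const std::string& name, float scale) {
        auto it = loaded.find(name);
        if (it == loaded.end()) {
            return false;
        }
        it->second.scale = scale;
        return true;
    }

    // Shared aggregation: element-wise sum of scale * data over all vectors.
    std::vector<float> aggregate() const {
        std::vector<float> out(expected_size, 0.0f);
        for (const auto& entry : loaded) {
            const ControlVector& v = entry.second;
            for (std::size_t i = 0; i < expected_size; ++i) {
                out[i] += v.scale * v.data[i];
            }
        }
        return out;
    }
};
```

In this sketch a caller would invoke `aggregate()` after each state change and hand the result to the model context once, which mirrors the "applied globally" note; how the actual server does this is not shown in the commit message.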

Co-authored-by: Gapeleon <gapeleon@users.noreply.github.com>

2026-02-04 16:07:18 +02:00

85 KiB
