webui update (#1003)

webui: add system message in export conversation, support upload conversation with system message Webui: show upload only when in new conversation Webui: Add model name webui: increase height of chat message window when clicking editing Webui: autoclose settings dialog dropdown and maximze screen width when zoom in webui: fix date issues and add more dates webui: change error to toast.error. server: add n_past and slot_id in props_simple webui: add cache tokens, context and prompt speed in chat webui: modernize ui webui: change welcome message webui: change speed display webui: change run python icon webui: add config to use server defaults for sampler webui: put speed on left and context on right webui: recognize AsciiDoc files as valid text files (#16850) * webui: recognize AsciiDoc files as valid text files * webui: add an updated static webui build * webui: add the updated dependency list * webui: re-add an updated static webui build Add a setting to display message generation statistics (#16901) * feat: Add setting to display message generation statistics * chore: build static webui output webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe (#16757) * webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe dialog Extended MarkdownContent to flag previewable code languages, add a preview button alongside copy controls, manage preview dialog state, and share styling for the new button group Introduced CodePreviewDialog.svelte, a sandboxed iframe modal for rendering HTML/JS previews with consistent dialog controls * webui: fullscreen HTML preview dialog using bits-ui * Update tools/server/webui/src/lib/components/app/misc/CodePreviewDialog.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/misc/MarkdownContent.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: pedantic style tweak for CodePreviewDialog close button * webui: remove overengineered preview language logic * chore: update webui static build --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> webui: auto-refresh /props on inference start to resync model metadata (#16784) * webui: auto-refresh /props on inference start to resync model metadata - Add no-cache headers to /props and /slots - Throttle slot checks to 30s - Prevent concurrent fetches with promise guard - Trigger refresh from chat streaming for legacy and ModelSelector - Show dynamic serverWarning when using cached data * fix: restore proper legacy behavior in webui by using unified /props refresh Updated assistant message bubbles to show each message's stored model when available, falling back to the current server model only when the per-message value is missing When the model selector is disabled, now fetches /props and prioritizes that model name over chunk metadata, then persists it with the streamed message so legacy mode properly reflects the backend configuration * fix: detect first valid SSE chunk and refresh server props once * fix: removed the slots availability throttle constant and state * webui: purge ai-generated cruft * chore: update webui static build feat(webui): improve LaTeX rendering with currency detection (#16508) * webui : Revised LaTeX formula recognition * webui : Further examples containg amounts * webui : vitest for maskInlineLaTeX * webui: Moved preprocessLaTeX to lib/utils * webui: LaTeX in table-cells * chore: update webui build output (use theirs) * webui: backslash in LaTeX-preprocessing * chore: update webui build output * webui: look-behind backslash-check * chore: update webui build output * Apply suggestions from code review Code maintenance (variable names, code formatting, string handling) Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * webui: Moved constants to lib/constants. * webui: package woff2 inside base64 data * webui: LaTeX-line-break in display formula * chore: update webui build output * webui: Bugfix (font embedding) * webui: Bugfix (font embedding) * webui: vite embeds assets * webui: don't suppress 404 (fonts) * refactor: KaTeX integration with SCSS Moves KaTeX styling to SCSS for better customization and font embedding. This change includes: - Adding `sass` as a dev dependency. - Introducing a custom SCSS file to override KaTeX variables and disable TTF/WOFF fonts, relying solely on WOFF2 for embedding. - Adjusting the Vite configuration to resolve `katex-fonts` alias and inject SCSS variables. * fix: LaTeX processing within blockquotes * webui: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> server : add props.model_alias (#16943) * server : add props.model_alias webui: fix keyboard shortcuts for new chat & edit chat title (#17007) Better UX for handling multiple attachments in WebUI (#17246) webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (#16618) * webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI - Purely visual and diagnostic change, no effect on model context, prompt construction, or inference behavior - Captured assistant tool call payloads during streaming and non-streaming completions, and persisted them in chat state and storage for downstream use - Exposed parsed tool call labels beneath the assistant's model info line with graceful fallback when parsing fails - Added tool call badges beneath assistant responses that expose JSON tooltips and copy their payloads when clicked, matching the existing model badge styling - Added a user-facing setting to toggle tool call visibility to the Developer settings section directly under the model selector option * webui: remove scroll listener causing unnecessary layout updates (model selector) * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> * chore: npm run format & update webui build output * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com> webui: Fix clickability around chat processing statistics UI (#17278) * fix: Better pointer events handling in chat processing info elements * chore: update webui build output Fix merge error webui: Add a "Continue" Action for Assistant Message (#16971) * feat: Add "Continue" action for assistant messages * feat: Continuation logic & prompt improvements * chore: update webui build output * feat: Improve logic for continuing the assistant message * chore: update webui build output * chore: Linting * chore: update webui build output * fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message * chore: update webui build output * feat: Enable "Continue" button based on config & non-reasoning model type * chore: update webui build output * chore: Update packages with `npm audit fix` * fix: Remove redundant error * chore: update webui build output * chore: Update `.gitignore` * fix: Add missing change * feat: Add auto-resizing for Edit Assistant/User Message textareas * chore: update webui build output Improved file naming & structure for UI components (#17405) * refactor: Component iles naming & structure * chore: update webui build output * refactor: Dialog titles + components namig * chore: update webui build output * refactor: Imports * chore: update webui build output webui: hide border of button webui: update webui: update webui: update add vision webui: minor settings reorganization and add disable autoscroll option (#17452) * webui: added a dedicated 'Display' settings section that groups visualization options * webui: added a Display setting to toggle automatic chat scrolling * chore: update webui build output Co-authored-by: firecoperana <firecoperana>
2026-02-01 20:19:52 +00:00 · 2025-11-24 00:03:45 -06:00
parent 920f424929
commit 07d08e15ad
91 changed files with 13517 additions and 11849 deletions
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -108,6 +108,8 @@ struct result_timings {
    double predicted_ms;
    double predicted_per_token_ms;
    double predicted_per_second;
+    int32_t n_ctx = 0;
+    int32_t n_past = 0;

    // Optional speculative metrics - only included when > 0
    int32_t draft_n = 0;
@@ -124,6 +126,9 @@ struct result_timings {
            {"predicted_ms",           predicted_ms},
            {"predicted_per_token_ms", predicted_per_token_ms},
            {"predicted_per_second",   predicted_per_second},
+
+            {"n_ctx",           n_ctx},
+            {"n_past",           n_past},
        };

        if (draft_n > 0) {
@@ -585,6 +590,13 @@ struct slot_params {
 };


+inline std::string get_model_name(std::string path)
+{
+    std::string filename = path.substr(path.find_last_of("/\\") + 1);
+    return filename;
+};
+
+
 struct server_prompt_checkpoint {
    llama_pos pos_min;
    llama_pos pos_max;
@@ -988,6 +1000,9 @@ struct server_slot {
            {"predicted_ms",           t_token_generation},
            {"predicted_per_token_ms", t_token_generation / n_decoded},
            {"predicted_per_second",   1e3 / t_token_generation * n_decoded},
+
+            {"n_ctx",           n_ctx},
+            {"n_past",           n_past},
        };
    }

@@ -1003,6 +1018,10 @@ struct server_slot {
        timings.predicted_per_token_ms = t_token_generation / n_decoded;
        timings.predicted_per_second = 1e3 / t_token_generation * n_decoded;

+        timings.n_ctx = n_ctx;
+        timings.n_past = n_past;
+
+
        // Add speculative metrics
        if (n_draft_total > 0) {
            timings.draft_n = n_draft_total;
@@ -4651,8 +4670,11 @@ int main(int argc, char ** argv) {
        }
        json data = {
            { "system_prompt",               ctx_server.system_prompt.c_str() },
+            { "model_alias",                 ctx_server.params.model_alias },
+            { "model_path",                  ctx_server.params.model},
            { "default_generation_settings", ctx_server.default_generation_settings_for_props },
            { "total_slots",                 ctx_server.params.n_parallel },
+            { "model_name",                  get_model_name(ctx_server.params.model)},
            { "chat_template",               common_chat_templates_source(ctx_server.chat_templates.get()) },
            { "bos_token",                   llama_token_to_piece(ctx_server.ctx, llama_token_bos(ctx_server.model), /* special= */ true)},
            { "eos_token",                   llama_token_to_piece(ctx_server.ctx, llama_token_eos(ctx_server.model), /* special= */ true)},
@@ -4673,6 +4695,28 @@ int main(int argc, char ** argv) {
        res.set_content(data.dump(), "application/json; charset=utf-8");
    };

+    const auto handle_props_simple = [&ctx_server](const httplib::Request& req, httplib::Response& res) {
+        res.set_header("Access-Control-Allow-Origin", req.get_header_value("Origin"));
+        int n_past = 0;
+        int slot_id = 0;
+        for (server_slot& slot : ctx_server.slots) {
+            if (slot.n_past > n_past) {
+                n_past = slot.n_past;
+                slot_id = slot.id;
+            }
+        }
+        json data = {
+            { "model_name",                  get_model_name(ctx_server.params.model)},
+            { "model_path",                  ctx_server.params.model },
+            { "modalities",                  json {
+                {"vision", ctx_server.oai_parser_opt.allow_image},
+                {"audio",  ctx_server.oai_parser_opt.allow_audio},
+            } },
+             { "n_ctx",                       ctx_server.n_ctx }
+        };
+        res.set_content(data.dump(), "application/json; charset=utf-8");
+    };
+

    // handle completion-like requests (completion, chat, infill)
    // we can optionally provide a custom format for partial results and final results
@@ -5411,6 +5455,7 @@ int main(int argc, char ** argv) {
    svr->Get ("/health",              handle_health);
    svr->Get ("/metrics",             handle_metrics);
    svr->Get ("/props",               handle_props);
+    svr->Get("/v1/props",             handle_props_simple);
    svr->Get ("/v1/models",           handle_models);
    svr->Post("/completion",          handle_completions); // legacy
    svr->Post("/completions", handle_completions); // legacy