webui update (#1003)

webui: add system message to exported conversations, support uploading conversations with a system message
webui: show upload only when in a new conversation
webui: add model name
webui: increase height of chat message window when clicking edit
webui: auto-close settings dialog dropdown and maximize screen width when zoomed in
webui: fix date issues and add more dates
webui: change error to toast.error.
server: add n_past and slot_id in props_simple
webui: add cache tokens, context and prompt speed in chat
webui: modernize ui
webui: change welcome message
webui: change speed display
webui: change run python icon
webui: add config to use server defaults for sampler
webui: put speed on left and context on right

webui: recognize AsciiDoc files as valid text files (#16850)

* webui: recognize AsciiDoc files as valid text files

* webui: add an updated static webui build

* webui: add the updated dependency list

* webui: re-add an updated static webui build

Add a setting to display message generation statistics (#16901)

* feat: Add setting to display message generation statistics

* chore: build static webui output

webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe (#16757)

* webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe dialog

Extended MarkdownContent to flag previewable code languages,
add a preview button alongside copy controls, manage preview
dialog state, and share styling for the new button group

Introduced CodePreviewDialog.svelte, a sandboxed iframe modal
for rendering HTML/JS previews with consistent dialog controls
(a sandboxing sketch follows this change list)

* webui: fullscreen HTML preview dialog using bits-ui

* Update tools/server/webui/src/lib/components/app/misc/CodePreviewDialog.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/misc/MarkdownContent.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: pedantic style tweak for CodePreviewDialog close button

* webui: remove overengineered preview language logic

* chore: update webui static build

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
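
A minimal sketch of the sandboxing approach referenced above (a hypothetical helper, not the actual CodePreviewDialog.svelte markup, which is a Svelte component built on bits-ui):

```ts
// Hypothetical helper showing the sandbox settings used for previews.
function buildPreviewFrame(code: string, language: string): HTMLIFrameElement {
  const iframe = document.createElement('iframe');
  // "allow-scripts" lets JS previews execute; omitting "allow-same-origin"
  // keeps the preview in an opaque origin, isolated from the app itself.
  iframe.setAttribute('sandbox', 'allow-scripts');
  iframe.srcdoc = language === 'html' ? code : `<script>${code}<\/script>`;
  return iframe;
}
```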

webui: auto-refresh /props on inference start to resync model metadata (#16784)

* webui: auto-refresh /props on inference start to resync model metadata

- Add no-cache headers to /props and /slots
- Throttle slot checks to 30s
- Prevent concurrent fetches with a promise guard (see the sketch after this list)
- Trigger refresh from chat streaming for legacy and ModelSelector
- Show dynamic serverWarning when using cached data

* fix: restore proper legacy behavior in webui by using unified /props refresh

Updated assistant message bubbles to show each message's stored model when available,
falling back to the current server model only when the per-message value is missing

When the model selector is disabled, now fetches /props and prioritizes that model name
over chunk metadata, then persists it with the streamed message so legacy mode properly
reflects the backend configuration

* fix: detect first valid SSE chunk and refresh server props once

* fix: removed the slots availability throttle constant and state

* webui: purge ai-generated cruft

* chore: update webui static build
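
A minimal sketch of the promise-guard plus no-cache pattern from the bullets above (identifiers are illustrative, not the webui's actual code):

```ts
// Illustrative names; the actual webui stores/services differ.
let inflight: Promise<unknown> | null = null;

async function refreshProps(baseUrl = ''): Promise<unknown> {
  // Promise guard: concurrent callers share one in-flight request.
  if (inflight) return inflight;
  inflight = fetch(`${baseUrl}/props`, {
    // Pairs with the server-side no-cache headers to avoid stale metadata.
    headers: { 'Cache-Control': 'no-cache' },
  })
    .then((r) => r.json())
    .finally(() => {
      inflight = null;
    });
  return inflight;
}

// Per the follow-up fix: call refreshProps() once when the first valid SSE
// chunk of a completion arrives, so model metadata resyncs at inference start.
```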

feat(webui): improve LaTeX rendering with currency detection (#16508)

* webui : Revised LaTeX formula recognition

* webui : Further examples containing amounts (see the currency-masking sketch after this change list)

* webui : vitest for maskInlineLaTeX

* webui: Moved preprocessLaTeX to lib/utils

* webui: LaTeX in table-cells

* chore: update webui build output (use theirs)

* webui: backslash in LaTeX-preprocessing

* chore: update webui build output

* webui: look-behind backslash-check

* chore: update webui build output

* Apply suggestions from code review

Code maintenance (variable names, code formatting, string handling)

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* webui: Moved constants to lib/constants.

* webui: package woff2 inside base64 data

* webui: LaTeX-line-break in display formula

* chore: update webui build output

* webui: Bugfix (font embedding)

* webui: Bugfix (font embedding)

* webui: vite embeds assets

* webui: don't suppress 404 (fonts)

* refactor: KaTeX integration with SCSS

Moves KaTeX styling to SCSS for better customization and font embedding.

This change includes:
- Adding `sass` as a dev dependency.
- Introducing a custom SCSS file to override KaTeX variables and disable TTF/WOFF fonts, relying solely on WOFF2 for embedding.
- Adjusting the Vite configuration to resolve `katex-fonts` alias and inject SCSS variables.

* fix: LaTeX processing within blockquotes

* webui: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
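
The currency-detection idea can be sketched as a mask/unmask pass (simplified; the real preprocessLaTeX in lib/utils also handles escaped backslashes, look-behind checks, table cells, and blockquotes):

```ts
// Simplified sketch: hide currency amounts so they are not mistaken for
// inline-math delimiters, then restore them after LaTeX processing.
const CURRENCY = /\$\d+(?:[.,]\d+)?/g;

function maskCurrency(text: string): { masked: string; tokens: string[] } {
  const tokens: string[] = [];
  const masked = text.replace(CURRENCY, (m) => {
    tokens.push(m);
    return `\u0000${tokens.length - 1}\u0000`; // placeholder survives math parsing
  });
  return { masked, tokens };
}

function unmask(text: string, tokens: string[]): string {
  return text.replace(/\u0000(\d+)\u0000/g, (_, i) => tokens[Number(i)]);
}

// Usage: mask amounts, run LaTeX delimiter handling, then restore, so that
// "It costs $5 and $x+y$" treats only $x+y$ as math.
```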

server : add props.model_alias (#16943)

* server : add props.model_alias

webui: fix keyboard shortcuts for new chat & edit chat title (#17007)

Better UX for handling multiple attachments in WebUI (#17246)

webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (#16618)

* webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI

- Purely visual and diagnostic change, no effect on model context, prompt
  construction, or inference behavior

- Captured assistant tool call payloads during streaming and non-streaming
  completions, and persisted them in chat state and storage for downstream use
  (see the accumulation sketch after this section)

- Exposed parsed tool call labels beneath the assistant's model info line
  with graceful fallback when parsing fails

- Added tool call badges beneath assistant responses that expose JSON tooltips
  and copy their payloads when clicked, matching the existing model badge styling

- Added a user-facing setting to toggle tool call visibility to the Developer
  settings section directly under the model selector option

* webui: remove scroll listener causing unnecessary layout updates (model selector)

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

* chore: npm run format & update webui build output

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
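
A sketch of how streamed OAI-compatible tool-call deltas can be accumulated for persistence (types trimmed to the fields used here; not the webui's actual types or stores):

```ts
// Partial shape of an OAI-compatible tool_call delta from a streamed chunk.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

// Merge each chunk's deltas into per-call accumulators for later persistence.
function accumulateToolCalls(
  acc: { id: string; name: string; arguments: string }[],
  deltas: ToolCallDelta[] = []
): void {
  for (const d of deltas) {
    const slot = (acc[d.index] ??= { id: '', name: '', arguments: '' });
    if (d.id) slot.id = d.id;
    if (d.function?.name) slot.name += d.function.name;
    if (d.function?.arguments) slot.arguments += d.function.arguments;
  }
}
// Per chunk: accumulateToolCalls(calls, chunk.choices[0]?.delta?.tool_calls).
```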

webui: Fix clickability around chat processing statistics UI (#17278)

* fix: Better pointer events handling in chat processing info elements

* chore: update webui build output

Fix merge error

webui: Add a "Continue" Action for Assistant Message (#16971)

* feat: Add "Continue" action for assistant messages

* feat: Continuation logic & prompt improvements

* chore: update webui build output

* feat: Improve logic for continuing the assistant message

* chore: update webui build output

* chore: Linting

* chore: update webui build output

* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with an assistant message (see the continuation sketch after this list)

* chore: update webui build output

* feat: Enable "Continue" button based on config & non-reasoning model type

* chore: update webui build output

* chore: Update packages with `npm audit fix`

* fix: Remove redundant error

* chore: update webui build output

* chore: Update `.gitignore`

* fix: Add missing change

* feat: Add auto-resizing for Edit Assistant/User Message textareas

* chore: update webui build output
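
A sketch of the prefill-based continuation described above (endpoint per the OAI-compatible API; identifiers illustrative):

```ts
// Resend the conversation with the partial assistant message last, so the
// server continues that message instead of starting a new turn.
async function continueAssistant(
  messages: { role: string; content: string }[],
  partial: string
) {
  const res = await fetch('/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      stream: true,
      // The trailing assistant message acts as the prefill to extend.
      messages: [...messages, { role: 'assistant', content: partial }],
    }),
  });
  return res.body; // SSE stream whose tokens are appended to `partial`
}
```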

Improved file naming & structure for UI components (#17405)

* refactor: Component files naming & structure

* chore: update webui build output

* refactor: Dialog titles + components naming

* chore: update webui build output

* refactor: Imports

* chore: update webui build output

webui: hide border of button

webui: update

webui: update

webui: update

add vision

webui: minor settings reorganization and add disable autoscroll option (#17452)

* webui: added a dedicated 'Display' settings section that groups visualization options

* webui: added a Display setting to toggle automatic chat scrolling

* chore: update webui build output
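
A sketch of a setting-aware autoscroll guard like the one this option controls (names hypothetical):

```ts
// Only follow the stream when the setting is on and the user is already
// near the bottom of the chat container.
function maybeAutoScroll(el: HTMLElement, autoScrollEnabled: boolean): void {
  if (!autoScrollEnabled) return;
  const nearBottom = el.scrollHeight - el.scrollTop - el.clientHeight < 40;
  if (nearBottom) el.scrollTo({ top: el.scrollHeight });
}
```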

Co-authored-by: firecoperana <firecoperana>
Commit 07d08e15ad by firecoperana, committed 2025-11-24 00:03:45 -06:00 via GitHub; parent 920f424929.
91 changed files with 13517 additions and 11849 deletions.

@@ -108,6 +108,8 @@ struct result_timings {
double predicted_ms;
double predicted_per_token_ms;
double predicted_per_second;
int32_t n_ctx = 0;
int32_t n_past = 0;
// Optional speculative metrics - only included when > 0
int32_t draft_n = 0;
@@ -124,6 +126,9 @@ struct result_timings {
{"predicted_ms", predicted_ms},
{"predicted_per_token_ms", predicted_per_token_ms},
{"predicted_per_second", predicted_per_second},
{"n_ctx", n_ctx},
{"n_past", n_past},
};
if (draft_n > 0) {
@@ -585,6 +590,13 @@ struct slot_params {
};
inline std::string get_model_name(std::string path)
{
std::string filename = path.substr(path.find_last_of("/\\") + 1);
return filename;
};
struct server_prompt_checkpoint {
llama_pos pos_min;
llama_pos pos_max;
@@ -988,6 +1000,9 @@ struct server_slot {
{"predicted_ms", t_token_generation},
{"predicted_per_token_ms", t_token_generation / n_decoded},
{"predicted_per_second", 1e3 / t_token_generation * n_decoded},
{"n_ctx", n_ctx},
{"n_past", n_past},
};
}
@@ -1003,6 +1018,10 @@ struct server_slot {
timings.predicted_per_token_ms = t_token_generation / n_decoded;
timings.predicted_per_second = 1e3 / t_token_generation * n_decoded;
timings.n_ctx = n_ctx;
timings.n_past = n_past;
// Add speculative metrics
if (n_draft_total > 0) {
timings.draft_n = n_draft_total;
@@ -4651,8 +4670,11 @@ int main(int argc, char ** argv) {
}
json data = {
{ "system_prompt", ctx_server.system_prompt.c_str() },
{ "model_alias", ctx_server.params.model_alias },
{ "model_path", ctx_server.params.model},
{ "default_generation_settings", ctx_server.default_generation_settings_for_props },
{ "total_slots", ctx_server.params.n_parallel },
{ "model_name", get_model_name(ctx_server.params.model)},
{ "chat_template", common_chat_templates_source(ctx_server.chat_templates.get()) },
{ "bos_token", llama_token_to_piece(ctx_server.ctx, llama_token_bos(ctx_server.model), /* special= */ true)},
{ "eos_token", llama_token_to_piece(ctx_server.ctx, llama_token_eos(ctx_server.model), /* special= */ true)},
@@ -4673,6 +4695,28 @@ int main(int argc, char ** argv) {
res.set_content(data.dump(), "application/json; charset=utf-8");
};
const auto handle_props_simple = [&ctx_server](const httplib::Request& req, httplib::Response& res) {
res.set_header("Access-Control-Allow-Origin", req.get_header_value("Origin"));
int n_past = 0;
int slot_id = 0;
for (server_slot& slot : ctx_server.slots) {
if (slot.n_past > n_past) {
n_past = slot.n_past;
slot_id = slot.id;
}
}
json data = {
{ "model_name", get_model_name(ctx_server.params.model)},
{ "model_path", ctx_server.params.model },
{ "modalities", json {
{"vision", ctx_server.oai_parser_opt.allow_image},
{"audio", ctx_server.oai_parser_opt.allow_audio},
} },
{ "n_ctx", ctx_server.n_ctx }
};
res.set_content(data.dump(), "application/json; charset=utf-8");
};
// handle completion-like requests (completion, chat, infill)
// we can optionally provide a custom format for partial results and final results
@@ -5411,6 +5455,7 @@ int main(int argc, char ** argv) {
svr->Get ("/health", handle_health);
svr->Get ("/metrics", handle_metrics);
svr->Get ("/props", handle_props);
svr->Get("/v1/props", handle_props_simple);
svr->Get ("/v1/models", handle_models);
svr->Post("/completion", handle_completions); // legacy
svr->Post("/completions", handle_completions); // legacy