heretic

mirror of https://github.com/p-e-w/heretic.git synced 2026-05-19 11:59:04 +00:00

Author	SHA1	Message	Date
Philipp Emanuel Weidmann	bd1fa0ade4	feat(ara): remove `tie_to_original_matrix` term This term was found experimentally to be 3-4 orders of magnitude smaller than the others in most runs, and have no meaningful effect on the result of the optimization.	2026-03-04 09:13:38 +05:30
Philipp Emanuel Weidmann	3c5d6920bf	feat(ara): fix memory leak	2026-03-02 18:50:27 +05:30
Philipp Emanuel Weidmann	b8f4a9c985	feat(ara): implement optimization for ARA parameters	2026-03-02 14:36:57 +05:30
Philipp Emanuel Weidmann	154241f8a2	feat(ara): implement matrix optimization	2026-02-27 14:16:27 +05:30
Philipp Emanuel Weidmann	ea7c59a55a	feat(ara): add methods for obtaining module I/O	2026-02-25 17:44:46 +05:30
Philipp Emanuel Weidmann	27097bfe8e	build: bump version to 1.2.0 (tag: v1.2.0)	2026-02-14 18:11:42 +05:30
Philipp Emanuel Weidmann	025ab3a881	fix: disable LoRA export for now Workaround for #152	2026-02-14 16:56:12 +05:30
Philipp Emanuel Weidmann	1179013999	docs: update README	2026-02-14 16:32:08 +05:30
Philipp Emanuel Weidmann	fe7bc1bae3	docs: update README	2026-02-14 10:47:28 +05:30
Philipp Emanuel Weidmann	e70a1a85e8	fix: don't load checkpoint when evaluating a second model Fixes #144	2026-02-14 10:02:17 +05:30
Philipp Emanuel Weidmann	e7f8be98b7	fix: only export tokenizer when exporting full model Fixes #143	2026-02-14 09:18:22 +05:30
Philipp Emanuel Weidmann	6017bcd347	fix: use compatible release specifiers for non-dev dependencies Fixes #145 Credit to MuX on Discord for recognizing that this is an issue with Transformers 5	2026-02-13 12:27:57 +05:30
Philipp Emanuel Weidmann	dd0b3a2f69	docs: update README	2026-02-11 11:09:17 +05:30
Philipp Emanuel Weidmann	b873598b77	docs: improve settings documentation	2026-02-11 10:19:05 +05:30
Philipp Emanuel Weidmann	10ceb3098e	chore: update copyright notice	2026-02-11 09:46:36 +05:30
Salman Chishti	745b582414	ci: upgrade GitHub Actions to latest versions (#137) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2026-02-08 16:49:04 +05:30
Salman Chishti	d0e9462fb8	ci: upgrade GitHub Actions for Node 24 compatibility (#136) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2026-02-08 16:48:12 +05:30
Philipp Emanuel Weidmann	f68a887a7b	fix: improve code quality, improve UX, fix small bugs	2026-02-08 13:32:00 +05:30
Philipp Emanuel Weidmann	2690655a83	feat: print memory usage during run	2026-02-02 21:18:01 +05:30
Spiky Moth	3525b1ac22	Implement Magnitude-Preserving Orthogonal Ablation (#52) * feat: add support for winsorizing the residuals Adds setting winsorization_quantile, expressed as the quantile to clamp to. - If set to a value below 1, the residuals obtained from evaluating the first token of the good and bad prompts are winsorized - that is, values outside the given quantile are clamped. Note that winsorization_quantile = 0.95 corresponds to a 90% winsorization. * feat: implement magnitude-preserving orthogonal ablation Adds boolean setting orthogonalize_direction: - When enabled, only the component of the refusal directions that is orthogonal to the harmless direction is subtracted during abliteration. Adds enum-valued setting row_normalization: - 'none': No normalization. - 'pre': Row-normalize the weight matrix before computing the LoRA adapter. - 'full': Like 'pre', but re-normalizes to preserve original row magnitudes. * prefer 'good' and 'bad' over 'harmless' and 'harmful' * clarify how winsorization is applied * store and reuse full peft_config * remove unneeded cast * make LoRA rank configurable for full normalization * explain why the singular values are split across the components	2026-02-02 17:05:19 +05:30
anrp	42f5a9b553	fix: Use file instead of symlink lock (for windows) (#116)	2026-01-25 19:34:01 +05:30
anrp	451db0b76e	fix: specify study name (#119) If we don't, optuna will generate a UUID for a name, which will never be found when loading as it is a "different" study. https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.create_study.html#optuna.study.create_study	2026-01-25 18:48:23 +05:30
anrp	ebc22c299e	feat: Allow study progress to be saved & resumed (#106) * feat: Store active study in log/study.jsonl and allow resuming * Simplify resume logic with load_if_exists=True * Significantly improve flexibility of study save/load * Put constructor arguments at the highest precedence * Review comments --------- Co-authored-by: Spiky Moth <spikymoth@pm.me>	2026-01-23 19:49:37 +05:30
anrp	d5c834c51d	fix: Allow abliterating VL models (#108) Per https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes, it indicates that "There is one class of AutoModel for each task." Use the presence of "vision_config" in the config.json to determine which.	2026-01-23 19:34:31 +05:30
anrp	c86f49035e	feat: Refactor save machinery and always allow user to save LoRA (#110)	2026-01-20 18:53:47 +05:30
anrp	85a6ec5ecb	fix: Include kernels (allows MXFP4 to be loaded in MXFP4 instead of upcasting) (#107) Co-authored-by: Andrew Patrikalakis <anrp@tri.global>	2026-01-16 17:30:24 +05:30
Philipp Emanuel Weidmann	632b1da622	feat: add config file for slop reduction	2026-01-11 18:51:26 +05:30
Philipp Emanuel Weidmann	1cfd09d7f3	ci: add style guide for Gemini	2026-01-09 14:58:56 +05:30
Philipp Emanuel Weidmann	09be09e12e	fix: restore classification of empty responses as refusals Fixes #93	2026-01-02 16:50:02 +05:30
Philipp Emanuel Weidmann	039f6222d2	feat: allow overriding the system prompt per dataset	2025-12-31 14:26:44 +05:30
Philipp Emanuel Weidmann	c4b2ea0c42	feat: allow injecting prefixes and suffixes into prompts	2025-12-31 12:00:44 +05:30
Philipp Emanuel Weidmann	02a5237a02	feat: add option to print prompt/response pairs	2025-12-27 14:48:29 +05:30
Philipp Emanuel Weidmann	cf8cf6f349	fix: address remaining ty complaint	2025-12-22 11:12:45 +05:30
Philipp Emanuel Weidmann	2141e110fb	ci: treat ty warnings as errors	2025-12-22 10:57:36 +05:30
Philipp Emanuel Weidmann	39101137ef	ci: add type checking	2025-12-22 10:48:42 +05:30
Philipp Emanuel Weidmann	064bed9a9f	fix: resolve issues raised by ty A single issue has been deliberately left unfixed to verify that the CI check works	2025-12-22 10:24:55 +05:30
_Vinayyyy_	8d44b65670	feat: add continuous optimization option(latest changes updated) (#76) * fix: a little merge bug * refactor: simplify optimization loop based on feedback * fix: address review comments * fix: remove redundant check for study.best_trials * fix: restore comments --------- Co-authored-by: Vinay Umrethe <vinayumrethe99@gmail.com>	2025-12-20 18:57:57 +05:30
Philipp Emanuel Weidmann	5ddef6fd2f	feat: add more CoT templates Suggested by u/Chromix_ on Reddit	2025-12-20 17:12:46 +05:30
michaelh	92d0c0d551	feat: enumerate all available GPUs on startup (#86) * feat: enumerate all available GPUs on startup * feat: extend device enumeration to all accelerator types	2025-12-16 17:42:15 +05:30
michaelh	243f821d93	feat: Add 4-bit loading + LoRA support for low VRAM optimization (#60) * Add files via upload * perf: optimize abliteration matrix op (#46) * perf: optimize abliteration matrix op * refactor: comments and var names correspond with arditi * refactor: fix comments and improve var notation * fix: accidental line change and improve comments --------- Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com> * Fix line endings to LF * Add hybrid approach for GPT-OSS compatibility - Check for LoRA adapters before attempting LoRA abliteration - Fall back to direct weight modification for nn.Parameter (GPT-OSS) - Ensures compatibility across all model architectures * Fix projector bug, update print statement, revert README * Revert README changes to match upstream * Fix import sorting for ruff * Fix reload_model for evaluate_model, add type hints and validation * Apply ruff formatting * Replace load_in_4bit with quantization enum * Fix precision loss: use FP32 refusal direction directly * Move r assignment into non-LoRA path * Fix linting: apply ruff formatting * Add auto-merge for LoRA adapters on save/upload * Fix linting: apply ruff formatting * Implement CPU-based merge for 4-bit models with OOM fallback * Remove use_lora flag (LoRA always on), add user prompt for 4-bit export * Fix: PEFT target_modules expects module names without path prefix * Fix linting: apply ruff formatting * Add LoRA fallback and fix quantization_config handling - Add try/except around LoRA initialization with fallback to direct weight modification - Only pass quantization_config when not None (fixes gpt-oss loading) - Use simple forward pass instead of generate() for model test (avoids chat template issues) - Reset non-LoRA models by reloading in reload_model() - Check self.use_lora before accessing LoRA adapters in abliterate() * Add 8-bit quantization support via bitsandbytes - Add BNB_8BIT option to QuantizationMethod enum - Add --load-in-8bit CLI support (auto via pydantic-settings) - Update documentation in config.py and config.default.toml - Useful for mid-range VRAM (12-16 GB) as balance between memory and numeric stability * Improve LoRA merge warning and fix linting * Apply final ruff formatting * Fix CI: apply ruff import sorting * Use tiny model for CI efficiency * Fix import sorting in test_lora.py * Fix formatting in test_lora.py * feat: Show merge warning for all models (requires high RAM) * style: Apply ruff fixes * Fix undefined Style import in main.py * Fix(model): Support MoE/3D tensors and enforce dtype safety in abliterate * Fix(ci): Format model.py with ruff * Fix(main): Remove invalid style argument from prompt_select and unused import * Fix logic errors, memory leak, and redundant merges in main.py * Fix linting and formatting issues (isort, ruff) * chore: Simplify .gitattributes as requested * refactor: Remove defensive try-except around LoRA initialization * chore: Update uv.lock with peft and bitsandbytes * chore: Regenerate uv.lock to include missing peft dependency * style: Fix import sorting (isort) for CI compliance * style: Simplify .gitattributes to single line as requested * Address PR #60 feedback: Remove caching, fix LoRA reload, global LoRA usage, style fixes * Address PR review comments: clarify code, fix quantization, rename method - Add explanatory comments for warning suppression and gc behavior - Remove redundant gc.collect() calls (empty_cache handles it) - Fix output message order (ask merge strategy before 'Uploading...') - Add comment explaining 8-bit quantization doesn't need compute_dtype - Remove extra newline after dtype comment - Add future-proofing note for hybrid layer support (#43) - Remove leftover comment in get_merged_model - Delete test_lora.py (debug script, not a real test) - Add comment explaining needs_reload flag purpose - Extract quantization config into _get_quantization_config() helper - Rename reload_model() to reset_model_for_trial() for clarity - Fix reload_model to respect quantization config (fixes evaluate_model bug) - Remove unused gc import * Restore gc.collect() before empty_cache() for large models * refactor: Remove LoRA fallback remnants, simplify code - Remove use_lora flag (always true since LoRA is always applied) - Remove isinstance(PeftModel) check in get_merged_model() (always true) - Simplify reset_model_for_trial() by removing defensive try/except - Remove redundant gc.collect() calls (empty_cache handles GC) - Remove unused gc import from main.py * Address p-e-w review feedback: rename reset_model, remove loaded_model_name, fix type hints, remove GPT-OSS MoE, update assertion * Restore skip logic for non-LoRA modules and fix 4-bit base_layer.weight access * Remove defensive lora_A check per review - get_layer_modules already filters * Fix try_add: nest component init inside Module check, add assert for unexpected types * Add note about module.weight assumption for type checking * Change 'Reloading model' to 'Resetting model' in logging --------- Co-authored-by: accemlcc <accemlcc@users.noreply.github.com> Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com> Co-authored-by: Hager <Michael.Hager@bruker.com>	2025-12-14 20:19:09 +05:30
Spiky Moth	9d1734855d	feat: avoid excessive low divergence iteration (#73) * feat: adjust scoring to avoid useless iteration Adjusts the scoring function to avoid targeting meaninglessly low KL divergences. Below a threshold value, the KL divergence score switches to the refusal count. Adds config option kl_divergence_target (defaulting to 0.01). * fix: Clean up parameter selection in objective Create variables for num_layers and last_layer_index * Improves readability and makes choices explicit * feat: Print the parameters of the selected model	2025-12-14 14:26:48 +05:30
George	740aab61ba	feat: add max_memory parameter to limit memory usage (#83) * add max_memory parameter to limit memory usage * Added to reload_model also * forgot to add self * Process max_memory once in __init__ and store it as an instance variable, then reuse it in both locations	2025-12-11 20:57:40 +05:30
Philipp Emanuel Weidmann	d9f2b0407a	build: bump version to 1.1.0 (tag: v1.1.0)	2025-12-10 16:54:03 +05:30
Philipp Emanuel Weidmann	ca783db6c9	docs: update README	2025-12-10 16:30:35 +05:30
Philipp Emanuel Weidmann	6acccac994	feat: add progress bars for plotting operations	2025-12-10 13:07:34 +05:30
Philipp Emanuel Weidmann	ac154a55a0	fix: suppress CoT output for thinking models Ref #75	2025-12-09 11:54:08 +05:30
Philipp Emanuel Weidmann	15781a8a0c	fix: skip common response prefix for thinking models Ref #75	2025-12-09 08:25:10 +05:30
Philipp Emanuel Weidmann	24c3aeb442	feat: turn boolean settings into CLI flags	2025-12-07 11:37:07 +05:30
Philipp Emanuel Weidmann	ffbde3ac2a	fix: follow up after recent PRs	2025-12-07 10:26:16 +05:30
Philipp Emanuel Weidmann	932d737edf	feat: add silhouette coefficient to residual geometry output	2025-12-07 08:48:38 +05:30


117 Commits
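
Commit 3525b1ac22 notes that `winsorization_quantile = 0.95` corresponds to a 90% winsorization. A minimal Python sketch of that arithmetic follows. It assumes symmetric clamping at the (1 - q) and q quantiles, which the commit message does not spell out; the helper names `quantile` and `winsorize` are hypothetical, and this is not heretic's implementation.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def winsorize(values, winsorization_quantile):
    """Clamp values to the [1 - q, q] quantile range (assumed symmetric)."""
    if not 0.5 < winsorization_quantile <= 1:
        raise ValueError("winsorization_quantile must be in (0.5, 1]")
    if winsorization_quantile == 1:
        # Per the commit message, winsorization only applies below 1.
        return list(values)
    ordered = sorted(values)
    low = quantile(ordered, 1 - winsorization_quantile)
    high = quantile(ordered, winsorization_quantile)
    return [min(max(value, low), high) for value in values]


if __name__ == "__main__":
    data = [float(i) for i in range(101)]
    clamped = winsorize(data, 0.95)
    # The lowest 5% and highest 5% are pulled in, so about 90% of the
    # values keep their original magnitude.
    print(min(clamped), max(clamped))
```

With q = 0.95, the bottom 5% and the top 5% of values are clamped, which leaves the middle 90% untouched; that is the sense in which 0.95 maps to a 90% winsorization under this assumption.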