ai-toolkit

mirror of https://github.com/ostris/ai-toolkit.git synced 2026-05-11 16:30:40 +00:00

Author	SHA1	Message	Date
Jaret Burkett	f38de2a2fe	Add tipsv2 locally and fix gradient checkpointing for it	2026-05-10 14:47:44 -06:00
Jaret Burkett	a12ddd72a1	Change the velocity weight cap on dfe 9	2026-05-07 07:37:05 -06:00
Jaret Burkett	963a9f42b2	Add decode latent to wan 2.1 models. Add gradient checkpointing to wan vae.	2026-05-05 11:30:16 -06:00
Jaret Burkett	acc6a36214	Scale DFE 9 to a velocity equiv weight to match flow matching gradient strength. Probably need to rework all DFEs to do this as the math checks out.	2026-04-28 09:10:02 -06:00
Jaret Burkett	1fc4ad3979	Add sapiens2 as a diffusion feature extractor	2026-04-27 15:59:03 -06:00
Jaret Burkett	488878f354	Use hidden layers in the loss for DFE 7 and 8	2026-04-18 13:07:38 -06:00
Jaret Burkett	beb40ae29b	Add DFE8 with partial step	2026-04-17 17:40:16 -06:00
Jaret Burkett	ab1ee4df34	Hotfix some issues with Wan models caused by diffusers and transformers updates	2026-04-16 20:53:50 +00:00
Jaret Burkett	233e292256	Added some experimental low step things for zeta	2026-04-13 09:37:34 -06:00
Jaret Burkett	78cf049c29	Add support for ACE-Step 1.5 and ACE-Step 1.5 XL. Also added dataset captioning through the UI. (#785) * Base ace step 1.5 xl added. Generating, still wip on training and ui * Base training code done * Fix some issues with caching text embeddings. Update sample cards to show audio * Fix issue with quantizing ace step * Add album artwork to samples with waveform. * Cleanup logs * Add album art endpoint to speed up album art loading * Made a make video with artwork script * Make ui handle basic audio models. Make multi line adjustments to the editor and better syntax highlighting. * Add prompt tagging system for special tagged models. * prompt tagging processing for ui working. * Moved default samples to a special file so we can add more when needed and they can be adjusted for a specific model * Add a captioner job with music captioner that is prepped for use with the ui * Add basic ui setup for captioning modal and handling captioning jobs * Starting captioning job from ui working. Still better management for it. * Better filtering of job options in the job view for captioning jobs * Added qwen3 vl as a captioner for images * Have an indicator when a dataset is being captioned. * Adjust the way caption jobs look in the queue * Fix a few issues. Adjust defaults. * Version bump * Added ace step to the readme.	2026-04-09 15:02:03 -06:00
Jaret Burkett	171535833a	Add Mac OS support for Apple Silicon (#770) * Made an install script and auto updates env for mac * GPU sensors and initial training working for MAC. Still WIP. * Switch dataloader to single threaded until I can work around some mac pickling issues. * Get quantization working on mac * Fix mac exclusive imports so they don't break other builds. * Add mac instructions to the UI	2026-03-30 09:37:47 -06:00
Jaret Burkett	f85bf065bf	Use pooler embeddings for DFE v6 with dino v3	2026-03-27 07:02:07 -06:00
Jaret Burkett	5642b656b9	Fix audio issues with ltx2 models. Silent codec fails now raised. Auto convert surround sound audio to stereo. Invalidate old caches just to be safe so they recache now.	2026-03-23 20:08:33 +00:00
Jaret Burkett	b04c64e0f8	Add a dino version of DFE	2026-03-04 08:20:37 -07:00
Jaret Burkett	3632656cda	make DFE work with more VAEs	2026-02-18 09:46:37 -07:00
Jaret Burkett	115f0a3670	Fixed error with wan models when caching text embeddings	2026-02-06 14:26:53 -07:00
Jaret Burkett	1ce2428722	Shrink text embeds to max token length for LTX-2. Drastically reduces cached text embedding sizes	2026-01-28 12:54:49 -07:00
Jaret Burkett	73dedbf662	Do caching of latents, first frame and audio when caching latents for LTX2	2026-01-14 11:05:23 -07:00
Jaret Burkett	5b5aadadb8	Add LTX-2 Support (#644) * WIP, adding support for LTX2 * Training on images working * Fix loading comfy models * Handle converting and deconverting lora so it matches original format * Reworked ui to handle ltx and proper dataset default overwriting. * Update the way lokr saves so it is more compatible with comfy * Audio loading and synchronization/resampling is working * Add audio to training. Does it work? Maybe, still testing. * Fixed fps default issue for sound * Have ui set fps for accurate audio mapping on ltx * Added audio processing options to the ui for ltx * Clean up requirements	2026-01-13 04:55:30 -07:00
Jaret Burkett	d42f5af2fc	Fixed issue with DOP when using Z-Image	2025-11-28 09:36:21 -07:00
Jaret Burkett	4e62c38df5	Add support for training Z-Image Turbo with a de-distill training adapter	2025-11-28 08:08:53 -07:00
Jaret Burkett	ff14cd6343	Fix check for making sure vae is on the right device.	2025-10-21 14:49:20 -06:00
Jaret Burkett	76ce757e0c	Added initial support for layer offloading with Wan 2.2 14B models.	2025-10-20 14:54:30 -06:00
Jaret Burkett	dc1cc3e78a	Fixed issue where multi control samples didn't work when not caching	2025-10-05 14:38:53 -06:00
Jaret Burkett	4e5707854f	Initial support for RamTorch. Still a WIP	2025-10-05 13:03:26 -06:00
Jaret Burkett	3086a58e5b	git status	2025-10-01 14:12:17 -06:00
Jaret Burkett	454be0958a	Initial support for qwen image edit plus	2025-09-24 11:39:10 -06:00
Jaret Burkett	f74475161e	Add stepped loss type	2025-09-22 15:50:12 -06:00
Jaret Burkett	28728a1e92	Added experimental dfe 5	2025-09-21 10:48:52 -06:00
Jaret Burkett	3666b112a8	DEF for fake vae and adjust scaling	2025-09-12 18:09:08 -06:00
Jaret Burkett	b95c17dc17	Add initial support for chroma radiance	2025-09-10 08:41:05 -06:00
Jaret Burkett	1f541bc5d8	Changes to handle a different DFE arch	2025-08-27 11:05:16 -06:00
Jaret Burkett	bf2700f7be	Initial support for finetuning qwen image. Will only work with caching for now, need to add controls everywhere.	2025-08-21 16:41:17 -06:00
Jaret Burkett	8ea2cf00f6	Added training to the ui. Still testing, but everything seems to be working.	2025-08-16 05:51:37 -06:00
Jaret Burkett	3413fa537f	Wan22 14b training is working, still need tons of testing and some bug fixes	2025-08-14 13:03:27 -06:00
Jaret Burkett	be71cc75ce	Switch to unified text encoder for wan models. Pred for 2.2 14b	2025-08-14 10:07:18 -06:00
Jaret Burkett	77b10d884d	Add support for training with an accuracy recovery adapter with qwen image	2025-08-12 08:21:36 -06:00
Jaret Burkett	bb6db3d635	Added support for caching text embeddings. This is just initial support and will probably fail for some models. Still needs to be optimized	2025-08-07 10:27:55 -06:00
Jaret Burkett	9dfb614755	Initial work for training wan first and last frame	2025-08-04 11:37:26 -06:00
Jaret Burkett	ca7c5c950b	Add support for Wan2.2 5B	2025-07-29 05:31:54 -06:00
Jaret Burkett	cefa2ca5fe	Added initial support for Hidream E1 training	2025-07-27 15:12:56 -06:00
Jaret Burkett	c5eb763342	Improvements to VAE trainer. Allow CLIP loss.	2025-07-24 06:50:56 -06:00
Jaret Burkett	1930c3edea	Fix naming with wan i2v new keys in lora	2025-07-14 07:34:01 -06:00
Jaret Burkett	755f0e207c	Fix issue with wan i2v scaling. Adjust aggressive loader to be compatible with updated diffusers.	2025-07-12 16:56:27 -06:00
Jaret Burkett	8d9c47316a	Work on mean flow. Minor bug fixes. Omnigen improvements	2025-06-26 13:46:20 -06:00
Jaret Burkett	03bc431279	Fixed an issue training lumina 2	2025-06-24 10:29:47 -06:00
Jaret Burkett	ba1274d99e	Added a guidance burning loss. Modified DFE to work with new model. Bug fixes	2025-06-23 08:38:27 -06:00
Jaret Burkett	8602470952	Updated diffusion feature extractor	2025-06-19 15:36:10 -06:00
Jaret Burkett	1c2b7298dd	More work on mean flow loss. Moved it to an adapter. Still not functioning properly though.	2025-06-16 07:17:35 -06:00
Jaret Burkett	c0314ba325	Fixed some issues with training mean flow algo. Still testing WIP	2025-06-16 07:14:59 -06:00

188 Commits