Commit Graph

1342 Commits

Author SHA1 Message Date
turboderp
4f83f52d7d Merge branch 'refs/heads/dev' v0.2.6 2024-12-07 15:56:16 +01:00
turboderp
15b5df784a Cleanup build actions 2024-12-07 15:55:53 +01:00
turboderp
ebaf819bc0 Merge remote-tracking branch 'origin/dev' into dev 2024-12-07 15:55:33 +01:00
turboderp
83a57c74ed Bump to v0.2.6 2024-12-07 15:55:11 +01:00
turboderp
ba9774f1c8 Enable noise tokens for Qwen2-VL quantization 2024-12-07 15:53:52 +01:00
turboderp
c55656cc0c Fix system RAM consumption while quantizing, fixes #692 2024-12-05 21:16:36 +01:00
turboderp
c86f62c3b8 Ensure MRoPE ID tensor is contiguous 2024-12-05 18:02:02 +01:00
Philipp Emanuel Weidmann
db78601226 Prevent NPE in deallocate_pages (#688)

If `deallocate_pages` is called on a job for which `allocate_pages`
has not been called (see `iterate_start_jobs` for conditions under
which this is true), `allocated_pages` is `None`, raising an NPE
when attempting to iterate.

In particular, this prevents `clear_queue` from working. In
practice, this problem readily occurs when starting a few jobs
and then calling `clear_queue`.
2024-12-01 22:02:32 +01:00
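
The failure mode described in the commit message above can be illustrated with a minimal sketch. The `Job` class and the bodies of `allocate_pages`, `deallocate_pages`, and `clear_queue` below are hypothetical stand-ins written for illustration only, not the project's actual code; only the names and the `None` condition come from the message. In Python the "NPE" would surface as `TypeError: 'NoneType' object is not iterable`, and the assumed fix is a simple `None` guard.

```python
from __future__ import annotations

from typing import Optional


class Job:
    """Hypothetical stand-in for a generator job (illustration only)."""

    def __init__(self) -> None:
        # Stays None until allocate_pages() has been called for this job.
        self.allocated_pages: Optional[list[int]] = None

    def allocate_pages(self, page_ids: list[int]) -> None:
        self.allocated_pages = list(page_ids)

    def deallocate_pages(self) -> list[int]:
        # Guard: a job that was queued but never started holds no pages,
        # so there is nothing to iterate over or release.
        if self.allocated_pages is None:
            return []
        released = self.allocated_pages
        self.allocated_pages = None
        return released


def clear_queue(jobs: list[Job]) -> int:
    """Release pages for every job, started or not; return the number released."""
    return sum(len(job.deallocate_pages()) for job in jobs)


started = Job()
started.allocate_pages([0, 1, 2])
pending = Job()  # allocate_pages() was never called for this one
print(clear_queue([started, pending]))
```

Without the guard, the `pending` job would make `clear_queue` fail as soon as it tried to iterate over `None`.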
turboderp
663eea1b53 Fix 64-bit dtype for MSVC 2024-12-01 20:09:40 +01:00
turboderp
bc7db9395d Merge remote-tracking branch 'origin/master' v0.2.5 2024-12-01 14:29:44 +01:00
turboderp
e3b5549e0b Bump to v0.2.5 2024-12-01 14:21:59 +01:00
turboderp
fa7e89c197 Update example 2024-12-01 14:20:33 +01:00
turboderp
48e6306193 Update chat example, prompt formats 2024-11-30 13:31:35 +01:00
turboderp
1f685bd8d3 Update grounding demo 2024-11-23 14:46:51 +01:00
turboderp
638cf3015f Add Qwen2-VL grounding demo 2024-11-23 12:19:02 +01:00
turboderp
bfa4b4f043 Don't clamp FP32 residual during quantization 2024-11-22 09:30:36 +01:00
turboderp
142190e1f8 DRY: Avoid out-of-bounds error when computing penalty for sequence with image tokens 2024-11-22 02:13:26 +01:00
turboderp
9cacd66229 Fix MRoPE model inference when no MM embeddings present 2024-11-20 05:49:03 +01:00
turboderp
69bb9d6cff Add optional noise embeddings during quantization 2024-11-20 05:48:22 +01:00
turboderp
5857ea9846 MLP: Fix temp state size calculation (for Qwen2-VL-72B mmp) 2024-11-18 16:31:57 +01:00
turboderp
c16aa9b3eb Update multimodal example 2024-11-18 07:51:41 +01:00
turboderp
c81603c441 Update multimodal example 2024-11-18 07:16:48 +01:00
turboderp
6aab7064e2 Support MRoPE (dynamic gen prompt ingest only) 2024-11-18 05:33:11 +01:00
turboderp
b1e786cee3 Fix regression 2024-11-18 04:25:42 +01:00
turboderp
d6177d568f Add grid def etc. MM embedding, keep embeddings in system RAM by default 2024-11-18 03:31:50 +01:00
turboderp
4d258742ed Refactor RoPE initialization 2024-11-18 03:24:26 +01:00
turboderp
be3eeb403d Add Qwen2-VL arch definition, preprocessor and vision tower 2024-11-16 11:36:42 +01:00
turboderp
2ac584cb24 MLP: Support quick_gelu in Torch fwd 2024-11-16 11:36:42 +01:00
turboderp
70bdb0969a MLP: Group input states 2024-11-16 11:36:42 +01:00
turboderp
6405225582 Refactoring 2024-11-16 11:36:42 +01:00
turboderp
5129f96231 Support Conv3d 2024-11-16 11:26:40 +01:00
turboderp
1776223296 Pixtral: Fix hardcoded device ID 2024-11-16 11:23:27 +01:00
turboderp
576303a152 Fix unload for Conv2D 2024-11-16 04:46:27 +01:00
turboderp
735fa7b4c3 Update build action 2024-11-13 05:56:17 +01:00
turboderp
6f9a697b6a Update build action 2024-11-12 09:32:50 +01:00
turboderp
c708e4fd84 Build actions 2024-11-12 07:14:23 +01:00
turboderp
9961dbdcaf Bump to 0.2.4 v0.2.4 2024-11-12 04:13:33 +01:00
turboderp
2a888dbd47 Pixtral example 2024-11-12 03:46:29 +01:00
turboderp
16cd5ef384 Generator: Make sampler settings optional instead of default arg 2024-11-12 03:41:59 +01:00
turboderp
90895967b1 Fix quantization for Pixtral, copy vision tower tensors to quantized model 2024-11-10 16:22:57 +01:00
turboderp
d37cf7e764 Fix regressions 2024-11-10 13:38:21 +01:00
turboderp
b28300c0db Pixtral: Refactor vision model, update example 2024-11-10 12:34:42 +01:00
turboderp
7c876ef091 Update Pixtral experiment 2024-11-10 11:17:21 +01:00
turboderp
193a6b2b36 Pixtral: Add vision tower and preprocessor 2024-11-10 11:15:06 +01:00
turboderp
9504b515f7 Formatting 2024-11-10 11:13:49 +01:00
turboderp
a2f0f87713 Pixtral: Load vision tower and preprocessor config 2024-11-10 10:42:08 +01:00
turboderp
26406f9360 Make attn keys mappable, switch attn/MLP shapes for vision model 2024-11-10 10:40:58 +01:00
turboderp
79ca8fb65b Add alt. RoPE sin/cos table as attn parameter, and non-causal option 2024-11-10 10:35:49 +01:00
turboderp
c5a21bccb7 Add Conv2D module 2024-11-10 10:32:08 +01:00
turboderp
525b3204e0 Fix PIL dependency, skip version check in preprocessor 2024-11-10 10:31:21 +01:00