turboderp
4f83f52d7d
Merge branch 'refs/heads/dev'
v0.2.6
2024-12-07 15:56:16 +01:00
turboderp
15b5df784a
Cleanup build actions
2024-12-07 15:55:53 +01:00
turboderp
ebaf819bc0
Merge remote-tracking branch 'origin/dev' into dev
2024-12-07 15:55:33 +01:00
turboderp
83a57c74ed
Bump to v0.2.6
2024-12-07 15:55:11 +01:00
turboderp
ba9774f1c8
Enable noise tokens for Qwen2-VL quantization
2024-12-07 15:53:52 +01:00
turboderp
c55656cc0c
Fix system RAM consumption while quantizing, fixes #692
2024-12-05 21:16:36 +01:00
turboderp
c86f62c3b8
Ensure MRoPE ID tensor is contiguous
2024-12-05 18:02:02 +01:00
Philipp Emanuel Weidmann
db78601226
Prevent NPE in deallocate_pages (#688)
Prevent NPE in `deallocate_pages`
If `deallocate_pages` is called on a job for which `allocate_pages`
has not been called (see `iterate_start_jobs` for conditions under
which this is true), `allocated_pages` is `None`, raising an NPE
when attempting to iterate.
In particular, this prevents `clear_queue` from working. In
practice, this problem readily occurs when starting a few jobs
and then calling `clear_queue`.
2024-12-01 22:02:32 +01:00
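The failure described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual code: the `Job` class, its `released` list, and the standalone `clear_queue` function are hypothetical stand-ins, and only the names `allocate_pages`, `deallocate_pages`, `allocated_pages`, and `clear_queue` come from the commit message. In Python the "NPE" would surface as a `TypeError`, because `None` is not iterable; a guard on `None` is one way to avoid it.

```python
from typing import List, Optional


class Job:
    """Hypothetical stand-in for a generator job; not the library's class."""

    def __init__(self) -> None:
        # Stays None until allocate_pages() runs, as the commit message describes.
        self.allocated_pages: Optional[List[int]] = None
        # Hypothetical bookkeeping so the effect of deallocation is observable.
        self.released: List[int] = []

    def allocate_pages(self, pages: List[int]) -> None:
        self.allocated_pages = list(pages)

    def deallocate_pages(self) -> None:
        # Guard: a job that never allocated pages has nothing to release.
        # Without this check, iterating over None raises TypeError.
        if self.allocated_pages is None:
            return
        for page in self.allocated_pages:
            self.released.append(page)
        self.allocated_pages = None


def clear_queue(jobs: List[Job]) -> None:
    """Hypothetical queue clear: deallocate every job, started or not."""
    for job in jobs:
        job.deallocate_pages()
    jobs.clear()
```

With the guard in place, clearing a queue that mixes started and not-yet-started jobs completes without raising, which is the scenario the commit message says failed before the fix.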
turboderp
663eea1b53
Fix 64-bit dtype for MSVC
2024-12-01 20:09:40 +01:00
turboderp
bc7db9395d
Merge remote-tracking branch 'origin/master'
v0.2.5
2024-12-01 14:29:44 +01:00
turboderp
e3b5549e0b
Bump to v0.2.5
2024-12-01 14:21:59 +01:00
turboderp
fa7e89c197
Update example
2024-12-01 14:20:33 +01:00
turboderp
48e6306193
Update chat example, prompt formats
2024-11-30 13:31:35 +01:00
turboderp
1f685bd8d3
Update grounding demo
2024-11-23 14:46:51 +01:00
turboderp
638cf3015f
Add Qwen2-VL grounding demo
2024-11-23 12:19:02 +01:00
turboderp
bfa4b4f043
Don't clamp FP32 residual during quantization
2024-11-22 09:30:36 +01:00
turboderp
142190e1f8
DRY: Avoid out-of-bounds error when computing penalty for sequence with image tokens
2024-11-22 02:13:26 +01:00
turboderp
9cacd66229
Fix MRoPE model inference when no MM embeddings present
2024-11-20 05:49:03 +01:00
turboderp
69bb9d6cff
Add optional noise embeddings during quantization
2024-11-20 05:48:22 +01:00
turboderp
5857ea9846
MLP: Fix temp state size calculation (for Qwen2-VL-72B mmp)
2024-11-18 16:31:57 +01:00
turboderp
c16aa9b3eb
Update multimodal example
2024-11-18 07:51:41 +01:00
turboderp
c81603c441
Update multimodal example
2024-11-18 07:16:48 +01:00
turboderp
6aab7064e2
Support MRoPE (dynamic gen prompt ingest only)
2024-11-18 05:33:11 +01:00
turboderp
b1e786cee3
Fix regression
2024-11-18 04:25:42 +01:00
turboderp
d6177d568f
Add grid def etc. MM embedding, keep embeddings in system RAM by default
2024-11-18 03:31:50 +01:00
turboderp
4d258742ed
Refactor RoPE initialization
2024-11-18 03:24:26 +01:00
turboderp
be3eeb403d
Add Qwen2-VL arch definition, preprocessor and vision tower
2024-11-16 11:36:42 +01:00
turboderp
2ac584cb24
MLP: Support quick_gelu in Torch fwd
2024-11-16 11:36:42 +01:00
turboderp
70bdb0969a
MLP: Group input states
2024-11-16 11:36:42 +01:00
turboderp
6405225582
Refactoring
2024-11-16 11:36:42 +01:00
turboderp
5129f96231
Support Conv3d
2024-11-16 11:26:40 +01:00
turboderp
1776223296
Pixtral: Fix hardcoded device ID
2024-11-16 11:23:27 +01:00
turboderp
576303a152
Fix unload for Conv2D
2024-11-16 04:46:27 +01:00
turboderp
735fa7b4c3
Update build action
2024-11-13 05:56:17 +01:00
turboderp
6f9a697b6a
Update build action
2024-11-12 09:32:50 +01:00
turboderp
c708e4fd84
Build actions
2024-11-12 07:14:23 +01:00
turboderp
9961dbdcaf
Bump to 0.2.4
v0.2.4
2024-11-12 04:13:33 +01:00
turboderp
2a888dbd47
Pixtral example
2024-11-12 03:46:29 +01:00
turboderp
16cd5ef384
Generator: Make sampler settings optional instead of default arg
2024-11-12 03:41:59 +01:00
turboderp
90895967b1
Fix quantization for Pixtral, copy vision tower tensors to quantized model
2024-11-10 16:22:57 +01:00
turboderp
d37cf7e764
Fix regressions
2024-11-10 13:38:21 +01:00
turboderp
b28300c0db
Pixtral: Refactor vision model, update example
2024-11-10 12:34:42 +01:00
turboderp
7c876ef091
Update Pixtral experiment
2024-11-10 11:17:21 +01:00
turboderp
193a6b2b36
Pixtral: Add vision tower and preprocessor
2024-11-10 11:15:06 +01:00
turboderp
9504b515f7
Formatting
2024-11-10 11:13:49 +01:00
turboderp
a2f0f87713
Pixtral: Load vision tower and preprocessor config
2024-11-10 10:42:08 +01:00
turboderp
26406f9360
Make attn keys mappable, switch attn/MLP shapes for vision model
2024-11-10 10:40:58 +01:00
turboderp
79ca8fb65b
Add alt. RoPE sin/cos table as attn parameter, and non-causal option
2024-11-10 10:35:49 +01:00
turboderp
c5a21bccb7
Add Conv2D module
2024-11-10 10:32:08 +01:00
turboderp
525b3204e0
Fix PIL dependency, skip version check in preprocessor
2024-11-10 10:31:21 +01:00