10 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| turboderp | da2d335233 | Attn: Add paged-attn fallbacks using xformers or SDPA for head_dim > 256 | 2026-04-07 22:46:12 +02:00 |
| turboderp | 4356527867 | Pin pydantic to 2.11.0 | 2025-10-09 11:00:25 +02:00 |
| turboderp | 327d1f99d6 | Revert to flash_attn>=2.7.4.post1 until the wheel situation is sorted out | 2025-07-16 19:12:46 +02:00 |
| turboderp | ba4304a44b | Pin flash-attn at 2.7.4.post1 | 2025-07-15 20:42:36 +02:00 |
| turboderp | 08dde73e66 | Add Formatron support and improved logit masking | 2025-07-11 21:29:40 +02:00 |
| turboderp | e370ed289d | safetensors: Add trie search for tensor file map (marisa_trie) | 2025-07-08 19:52:00 +02:00 |
| turboderp | 6341b119ef | Loader: Add tensor override script | 2025-07-08 18:58:43 +02:00 |
| turboderp | 074019737c | VLM util functions | 2025-06-02 02:02:38 +02:00 |
| turboderp | 2f12246ec3 | Fix requirements | 2025-04-07 17:30:33 +02:00 |
| turboderp | 543c4b2771 | Initial commit | 2025-04-06 14:42:49 +02:00 |