28 Commits

Author SHA1 Message Date
TerminalMan
3aeddc5255 fix issues with optional dependencies (#204)
* fix issues with optional dependencies

* format document

* Tree: Format and comment
2024-09-19 22:24:55 -04:00
turboderp
318c425d84 Bump exllamav2 to 0.2.2
2024-09-14 21:43:26 +02:00
kingbri
cf97113868 Dependencies: Update Exllamav2
v0.2.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-08 21:12:31 -04:00
kingbri
565b0300d6 Dependencies: Update Exllamav2
v0.1.9

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
kingbri
8ff2586d45 Start: Fix pip update, method calls, and logging
platform.system() was not called in some places, breaking the
ternary on Windows.

Pip's --upgrade flag does not actually update dependencies to their
latest versions. That's what the --upgrade-strategy eager flag is for.

Tell the user where their start preferences are coming from.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-04 10:30:26 -04:00
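The two fixes described in 8ff2586d45 can be illustrated with a minimal sketch. This is not the project's actual start script: the function names and the `.[extra]` install target are hypothetical, chosen only to show the failure mode. Only `platform.system()` and pip's `--upgrade-strategy eager` option come from the commit message.

```python
import platform
import sys


def script_extension_buggy() -> str:
    # Bug: platform.system is a function object here, not its return value,
    # so the comparison is always False and the ternary always picks "sh".
    return "bat" if platform.system == "Windows" else "sh"


def script_extension_fixed() -> str:
    # Fix: call the function so the OS name string is compared.
    return "bat" if platform.system() == "Windows" else "sh"


def pip_upgrade_command(extra: str) -> list[str]:
    # Hypothetical helper: builds a pip command that also upgrades
    # dependencies. Without "--upgrade-strategy eager", pip defaults to
    # "only-if-needed" and leaves already-satisfied dependencies alone.
    return [
        sys.executable, "-m", "pip", "install",
        "--upgrade", "--upgrade-strategy", "eager",
        f".[{extra}]",  # assumed install target, for illustration only
    ]
```

The buggy variant fails silently on every platform, but it only matters on Windows, where the wrong branch is taken.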
kingbri
b6d2676f1c Start: Give the user a hint when a module can't be imported
If an ImportError or ModuleNotFoundError is raised, tell the user
to run the update scripts.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 21:59:06 -04:00
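The import hint described in b6d2676f1c could look roughly like the sketch below. The helper name and the wording of the hint are assumptions, not the project's code. Catching `ImportError` is enough to cover both cases named in the commit, because `ModuleNotFoundError` is a subclass of it.

```python
import importlib
from types import ModuleType
from typing import Optional, Tuple


def import_with_hint(module_name: str) -> Tuple[Optional[ModuleType], Optional[str]]:
    """Try to import a module; on failure return a user-facing hint instead.

    Hypothetical helper for illustration. Returns (module, None) on success
    and (None, hint) when the import fails.
    """
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:  # also catches ModuleNotFoundError
        hint = (
            f"Could not import '{module_name}' ({exc}). "
            "Try running the update scripts to reinstall dependencies."
        )
        return None, hint
```

A start script would print the hint and exit rather than show a bare traceback.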
kingbri
073e9fa6f0 Dependencies: Bump ExllamaV2
v0.1.7

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-11 14:22:50 -04:00
kingbri
773639ea89 Model: Fix flash-attn checks
If flash attention is already turned off by exllamaV2 itself, don't
try creating a paged generator. Also condense all the redundant
logic into one if statement.

Also check arch_compat_overrides to see if flash attention should
be disabled for a model arch (ex. Gemma 2)

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-06 20:58:24 -04:00
kingbri
c5ea2abe24 Dependencies: Update ExllamaV2
v0.1.6

Signed-off-by: kingbri <bdashore3@proton.me>
2024-06-23 21:45:04 -04:00
kingbri
c575105e41 ExllamaV2: Cleanup log placements
Move the large import errors into the check functions themselves.
This helps reduce the difficulty in interpreting where errors are
coming from.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-06-16 00:16:03 -04:00
DocShotgun
156b74f3f0 Revision to paged attention checks (#133)
* Model: Clean up paged attention checks

* Model: Move cache_size checks after paged attn checks
Cache size is only relevant in paged mode

* Model: Fix no_flash_attention

* Model: Remove no_flash_attention
Ability to use flash attention is auto-detected, so this flag is unneeded. Uninstall flash attention to disable it on supported hardware.
2024-06-09 17:28:11 +02:00
DocShotgun
55d979b7a5 Update dependencies, support Python 3.12, update for exl2 0.1.5 (#134)
* Dependencies: Add wheels for Python 3.12

* Model: Switch fp8 cache to Q8 cache

* Model: Add ability to set draft model cache mode

* Dependencies: Bump exllamav2 to 0.1.5

* Model: Support Q6 cache

* Config: Add Q6 cache and draft_cache_mode to config sample
2024-06-09 17:27:39 +02:00
turboderp
e889fa3efe Bump exllamav2 to v0.1.4 (#128) 2024-06-04 02:32:08 +02:00
kingbri
19961f4126 Dependencies: Update ExllamaV2
v0.1.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-27 13:38:07 -04:00
kingbri
47582c2440 Dependencies: Update ExllamaV2
v0.1.0

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-25 21:16:14 -04:00
kingbri
cd78728a77 Dependencies: Update ExllamaV2
v0.0.21

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-11 19:26:03 -04:00
kingbri
0e015ad58e Dependencies: Update ExllamaV2
v0.0.20

ROCm 6.0 is now required

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-28 11:06:59 -04:00
kingbri
30c4554572 Requirements: Update Exllamav2
v0.0.18

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-07 18:00:56 -04:00
kingbri
6ecce1604b Model: Fix log if exl2 version is too low
Switch to pyproject syntax.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-31 23:11:21 -04:00
kingbri
f534930270 Dependencies: Bump Exllamav2
v0.0.17

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-31 23:10:28 -04:00
kingbri
7020a0a2d1 Dependencies: Update Exllamav2
v0.0.16

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-20 15:21:37 -04:00
kingbri
228c227c1e Logging: Switch to loguru
Loguru is a flexible logger that allows for easier hooking and imports
into Rich with no problems. Also makes progress bars stick to the
bottom of the terminal window.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-08 01:00:48 -05:00
kingbri
39617adb65 Requirements: Update Exllamav2
v0.0.15

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-06 22:29:55 -05:00
kingbri
ccd41d720d Requirements: Bump ExllamaV2
v0.0.14

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-24 12:26:08 -05:00
kingbri
ea00a6bd45 Requirements: Update Exllamav2
Update to v0.0.13.post2

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-14 21:51:25 -05:00
kingbri
9f1d891490 Packages: Fix exllamav2 version check
Post versions are ok to use for checking if the user is on the correct
exllamav2 wheel.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-10 14:00:26 -05:00
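One way to accept post releases in a version check, as 9f1d891490 describes, is to compare only the numeric release segment. The sketch below uses the standard library only; the function names are hypothetical and this is not necessarily how the project implements the check.

```python
import re


def release_tuple(version: str) -> tuple:
    # Keep only the leading dotted-number release segment, so suffixes
    # such as ".post2" or "+cu121" are ignored for comparison purposes.
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_requirement(installed: str, required: str) -> bool:
    # A post release of the required version counts as satisfying it.
    return release_tuple(installed) >= release_tuple(required)
```

With this approach, `meets_requirement("0.0.13.post2", "0.0.13")` is true, while an older wheel such as `0.0.12` is still rejected.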
kingbri
6eeb62b82c Requirements: Update exllamav2, torch, and FA2
Torch to 2.2, exllamav2 to 0.0.13, FA2 to 2.4.2 on Windows and 2.5.2
on Linux.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-02 23:53:42 -05:00
kingbri
1919bf7705 Launch: Make exllamav2 requirement more friendly
Add the ability to use an unsafe config flag if needed and migrate
the exl2 check to a different file within the exl2 backend code.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-02 23:36:17 -05:00