tabbyAPI/common
kingbri 871c89063d Model: Add Tensor Parallel support
Use the tensor parallel loader when the flag is enabled. The new loader
has its own autosplit implementation, so gpu_split_auto isn't valid
here.

Also simplify how the cache type is chosen, replacing the chain of
if/else statements.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
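
The two changes described in the commit message can be illustrated with a minimal sketch. Only the `gpu_split_auto` option and the idea of a tensor parallel flag come from the commit text; the cache class names, the mode strings, and the functions `select_cache_class` and `resolve_load_options` are hypothetical stand-ins, not the actual code in this directory.

```python
# Hypothetical stand-ins for the backend's cache classes (assumed names).
class CacheFP16: ...
class CacheQ8: ...
class CacheQ6: ...
class CacheQ4: ...

# One lookup table instead of a chain of if/else statements.
# The mode strings are assumptions made for this sketch.
CACHE_CLASSES = {
    "FP16": CacheFP16,
    "Q8": CacheQ8,
    "Q6": CacheQ6,
    "Q4": CacheQ4,
}


def select_cache_class(cache_mode: str = "FP16"):
    """Return the cache class registered for cache_mode (case-insensitive)."""
    try:
        return CACHE_CLASSES[cache_mode.upper()]
    except KeyError:
        valid = ", ".join(CACHE_CLASSES)
        raise ValueError(
            f"Unknown cache mode {cache_mode!r}; expected one of: {valid}"
        ) from None


def resolve_load_options(tensor_parallel: bool, gpu_split_auto: bool) -> dict:
    """Pick a loader and decide whether gpu_split_auto is honored.

    Per the commit message, the tensor parallel loader has its own
    autosplit, so gpu_split_auto is not applied on that path.
    The returned dict layout is an assumption of this sketch.
    """
    if tensor_parallel:
        return {"loader": "tensor_parallel", "gpu_split_auto": False}
    return {"loader": "default", "gpu_split_auto": gpu_split_auto}
```

A table lookup keeps the list of supported modes in one place, so adding a mode is a one-line change and an unknown mode fails with a clear error instead of falling through a final `else`.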