mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-09 13:30:17 +00:00
Much slower than the fp16 based trellis. I guess, Apple doesn't have int8_t SIMD on the M2-Max GPU.
Much slower than the fp16 based trellis. I guess, Apple doesn't have int8_t SIMD on the M2-Max GPU.