* iq2_kt: Metal dequantize
* iq2_kt: Metal GEMV
Performance is actually quite decent: 52 t/s on my M2-Max for LlaMA-3.1-8B
* iq3_kt: Metal dequantize
* iq3_kt: Metal GEMV
Performance is not as good as iq2_kt: 40 t/s on my M2-Max for LlaMA-3.1-8B.
Flipping signs is a costly affair.
* iq4_kt: Metal dequantize - getting NaNs
* iq4_kt: Metal GEMV - also not working
* iq4_kt: Metal still not working
* Disable iq4_kt on Metal for now
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>