Some 2bit model without IQ4_KT quants, please!

by xakepp - opened Sep 6, 2025

Discussion

xakepp

Sep 6, 2025

IQ4_KT quants are disabled on the Metal backend in ik_llama. Such a model would be perfect for us poor 16-24GB Mac owners.

xakepp

Sep 7, 2025

Got IQ2_KL working with reconfigured memory allocation on an M1 with 16GB. I also need to set at least -ctk q6_0 -ctv q6_0. Using the recipes provided, I will create my own, replacing the IQ5_KT quants with IQ3_KT quants from IQ2_KL. Thanks a lot for your work!
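
For a rough sense of how the KV cache type (-ctk / -ctv) affects memory on a 16GB machine, here is a minimal back-of-the-envelope sketch. The model shape is hypothetical and not taken from this thread, and the 6.5 bits per value for q6_0 is an assumption (32 values packed into 26 bytes) that should be checked against the ik_llama.cpp source. It ignores compute buffers and any other overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bits_per_value):
    """Approximate combined size of the K and V caches, in bytes."""
    # One K and one V entry per layer, per KV head, per head dim, per token.
    values = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return values * bits_per_value / 8


# Hypothetical model shape, for illustration only (not from the thread).
shape = dict(n_layers=48, n_kv_heads=8, head_dim=128, n_ctx=32768)

f16 = kv_cache_bytes(**shape, bits_per_value=16)
q6 = kv_cache_bytes(**shape, bits_per_value=6.5)  # assumed cost of q6_0

print(f"f16 cache:  {f16 / 2**30:.2f} GiB")
print(f"q6_0 cache: {q6 / 2**30:.2f} GiB")
```

Under these assumptions the quantized cache is about 40% of the f16 size. Plug in the real layer count, KV head count, head dimension and context length from the model's metadata to get a usable estimate.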

xakepp changed discussion status to closed Sep 7, 2025

ubergarm

Owner Sep 7, 2025

Oh interesting, I don't pay much attention to Mac Metal as I don't have the hardware.

Does it support the KS quants? IQ4_KSS / IQ4_KS / IQ5_KS would be the best if those are supported. I'm not sure where to find a simple list of what is or is not supported on all the possible backends.

Keep us posted if you get it working, tag your Hugging Face repo with ik_llama.cpp, and feel free to link it here so folks can find it! Cheers!
