Article Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp Jan 30 • 17
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 12 days ago • 838
Direct Language Model Alignment from Online AI Feedback Paper • 2402.04792 • Published Feb 7, 2024 • 35
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 Feb 20 • 501