Article Getting More from Your Test-Time Compute Budget with Portfolio Beam Search danelbaz • Feb 24 • 8
Prune Once for All: Sparse Pre-Trained Language Models Paper • 2111.05754 • Published Nov 10, 2021 • 2
Article DeepMath: A lightweight math reasoning Agent with smolagents +1 danf, mber, moshew • Dec 4, 2025 • 40
Article Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models +3 imargulis, ofirzaf, sguskin, guybd, pcuenq • Sep 29, 2025 • 25
Article Breaking Language Barriers in Mathematical AI: Introducing Hebrew Math Tutor danf • Sep 7, 2025 • 3
Article Introducing HELMET: Holistically Evaluating Long-context Language Models +5 hyen, gaotianyu1350, houminmin, kding1, danf, moshew, cdq10131 • Apr 16, 2025 • 42
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques jmamou • Mar 24, 2025 • 20
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Paper • 2502.09390 • Published Feb 13, 2025 • 16
Speculative Decoding Draft Models Collection Collection of OpenVINO optimized efficient draft models for speculative decoding • 5 items • Updated 29 days ago • 13
Article A Chatbot on your Laptop: Phi-2 on Intel Meteor Lake +4 juliensimon, echarlaix, ofirzaf, imargulis, guybd, moshew • Mar 20, 2024 • 7