SpecExit: Accelerating Large Reasoning Model via Speculative Exit Paper • 2509.24248 • Published Sep 29, 2025
Tequila: Trapping-free Ternary Quantization for Large Language Models Paper • 2509.23809 • Published Sep 28, 2025
Sherry: Hardware-Efficient 1.25-Bit Ternary Quantization via Fine-grained Sparsification Paper • 2601.07892 • Published Jan 12, 2026