Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 480
Article Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR nvidia • Jan 5 • 86
Article Continuous batching from first principles ror, ArthurZ, mcpotato • Nov 25, 2025 • 389
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 134
Article Llasa Goes RL: Training LLaSA with GRPO for Improved Prosody and Expressiveness Steveeeeeeen • Nov 5, 2025 • 12
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv spisakjo, darktex, zkwentz, mortimerp9, Sanyam, Hamid-Nazeri, Pankit01, emre0, lewtun, reach-vb • Oct 23, 2025 • 162
TTS Collection Collection of some of the TTS models I found cool • 6 items • Updated Oct 10, 2025 • 1
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs Paper • 2509.09174 • Published Sep 11, 2025 • 62
Article Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ Wauplin, celinah, julien-c • Jul 25, 2025 • 84
Article KV Cache from scratch in nanoVLM ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 119
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 492