Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published 4 days ago • 59
PodAgent: A Comprehensive Framework for Podcast Generation Paper • 2503.00455 • Published Mar 1 • 6
Towards Robust Speech Representation Learning for Thousands of Languages Paper • 2407.00837 • Published Jun 30, 2024 • 11
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data Paper • 2402.08093 • Published Feb 12, 2024 • 62
simonl0909/whisper-large-v2-cantonese Automatic Speech Recognition • 2B • Updated Sep 30, 2023 • 243 • 13