Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published 7 days ago • 13
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published 12 days ago • 42
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization Paper • 2402.01692 • Published Jan 23, 2024 • 1
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration Paper • 2409.16117 • Published Sep 24, 2024
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3 • 18
Awesome papers from 臺大李宏毅 (Hung-yi Lee) Collection Recent papers authored by Hung-yi Lee. Sorted by ID • 8 items • Updated Oct 24 • 17
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling Paper • 2506.00736 • Published May 31 • 10