VibeVoice - a bezzam Collection

VibeVoice

updated Dec 8, 2025

bezzam/VibeVoice-1.5B

Text-to-Speech • 3B • Updated about 17 hours ago • 289 • 1

bezzam/VibeVoice-7B

Text-to-Speech • 9B • Updated about 17 hours ago • 555

bezzam/VibeVoice-AcousticTokenizer

Feature Extraction • 0.7B • Updated about 18 hours ago • 113

bezzam/VibeVoice-SemanticTokenizer

Feature Extraction • 0.3B • Updated Dec 3, 2025 • 2

bezzam/vibevoice_samples

Viewer • Updated Dec 10, 2025 • 2 • 492

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 141