MoCha
MoCha is a pioneering model for Dialogue-driven Movie Shot Generation.
| 🌐Project Page | 📖Paper | 🔗Github | 🤗Demo|
This repository provides a demo implementation of "MoCha: Towards Movie-Grade Talking Character Synthesis", built on top of HunyuanVideo.
We fine-tune HunyuanVideo on the Hallo3 dataset. Because the training data, model scale, and training strategy differ from the original, this demo does not fully reproduce the performance of the original MoCha model. It does reflect the core design and serves as a baseline for further research and study.
This implementation supports two generation modes:
st2v: speech + text → video
sti2v: image + speech + text → video
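The difference between the two modes is whether a reference image is supplied alongside the speech and text inputs. The following is a minimal sketch of that selection logic only. `GenerationRequest` and `select_mode` are hypothetical names introduced for illustration, not part of this repository's API; the actual entry points are documented in the GitHub repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    """Hypothetical container for the inputs to one generation call."""
    speech_path: str                  # driving speech audio (used by both modes)
    prompt: str                       # text description (used by both modes)
    image_path: Optional[str] = None  # reference image (only used by sti2v)


def select_mode(request: GenerationRequest) -> str:
    """Return 'sti2v' when a reference image is supplied, otherwise 'st2v'."""
    if not request.speech_path or not request.prompt:
        raise ValueError("both modes require a speech input and a text prompt")
    return "sti2v" if request.image_path else "st2v"


if __name__ == "__main__":
    # Placeholder file names, assumed for illustration only.
    print(select_mode(GenerationRequest("speech.wav", "A man talks on stage")))
    print(select_mode(GenerationRequest("speech.wav", "A man talks on stage", "ref.png")))
```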
See the 🔗GitHub repository for usage instructions.
Base model
tencent/HunyuanVideo