MoCha Demo Implementation

MoCha is a pioneering model for Dialogue-driven Movie Shot Generation.

| 🌐Project Page | 📖Paper | 🔗Github | 🤗Demo |

This repository provides a demo implementation of "MoCha: Towards Movie-Grade Talking Character Synthesis", built on top of HunyuanVideo.

We fine-tune HunyuanVideo on the Hallo3 dataset. Because the training data, model scale, and training strategy differ from those of the original MoCha model, this demo does not fully reproduce its performance. It does reflect the core design and serves as a baseline for further research and study.

This implementation supports two generation modes:

st2v: speech + text → video

sti2v: image + speech + text → video
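The difference between the two modes comes down to whether a reference image accompanies the speech and text inputs. The sketch below illustrates only that distinction. `GenerationRequest`, `select_mode`, and the field names are hypothetical and are not part of the demo's actual interface.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical container for illustration; not the demo's real API.
@dataclass
class GenerationRequest:
    speech_path: str                  # driving audio, required by both modes
    prompt: str                       # text description, required by both modes
    image_path: Optional[str] = None  # reference image, used by sti2v only


def select_mode(request: GenerationRequest) -> str:
    """Return 'sti2v' when a reference image is supplied, otherwise 'st2v'."""
    if not request.speech_path or not request.prompt:
        raise ValueError("both speech and text inputs are required")
    return "sti2v" if request.image_path else "st2v"
```

Under these assumptions, a request without an image maps to st2v and a request with an image maps to sti2v.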

Check out the 🔗Github for usage.

