bubbliiiing committed · Commit bd36100 (verified) · 1 Parent(s): 3a50865

Update README.md

Files changed (1):
  1. README.md (+103, -102)
README.md CHANGED

The only change is the addition of `library_name: videox_fun` to the YAML front matter; the updated README follows in full.
---
license: apache-2.0
library_name: videox_fun
---

# Z-Image-Turbo-Fun-Controlnet-Union

[![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

## Model Features
- This ControlNet is added to 6 blocks.
- The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at 1328 resolution using BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
- It supports multiple control conditions, including Canny, HED, Depth, Pose, and MLSD, and can be used like a standard ControlNet.
- You can adjust `control_context_scale` for stronger control and better detail preservation; the optimal range is 0.65 to 0.80. For better stability, we highly recommend using a detailed prompt.

## TODO
- [ ] Train on more data and for more steps.
- [ ] Support inpaint mode.

## Results

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose2.jpg" width="100%" /></td>
<td><img src="results/pose2.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose.jpg" width="100%" /></td>
<td><img src="results/pose.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Canny</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/canny.jpg" width="100%" /></td>
<td><img src="results/canny.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>HED</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/hed.jpg" width="100%" /></td>
<td><img src="results/hed.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Depth</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/depth.jpg" width="100%" /></td>
<td><img src="results/depth.png" width="100%" /></td>
</tr>
</table>

## Inference
See the [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) repository for full details.

First, clone the repository and create the required model directories:

```sh
# Clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git

# Enter VideoX-Fun's directory
cd VideoX-Fun

# Create model directories
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
```

Then download the Z-Image-Turbo base model into `models/Diffusion_Transformer` and the ControlNet weights into `models/Personalized_Model`.

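One way to fetch the weights is with the Hugging Face CLI. This is only a sketch: the repository IDs below are placeholders for the Hub repos that actually host the Z-Image-Turbo base model and this ControlNet.

```sh
# Placeholder repo IDs: replace <org> with the organizations that actually host the weights.
pip install -U "huggingface_hub[cli]"

# Base model -> models/Diffusion_Transformer/Z-Image-Turbo
huggingface-cli download <org>/Z-Image-Turbo --local-dir models/Diffusion_Transformer/Z-Image-Turbo

# ControlNet weights -> models/Personalized_Model
huggingface-cli download <org>/Z-Image-Turbo-Fun-Controlnet-Union Z-Image-Turbo-Fun-Controlnet-Union.safetensors --local-dir models/Personalized_Model
```

However you obtain them, the final layout should look like this:
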
```
📦 models/
├── 📂 Diffusion_Transformer/
│   └── 📂 Z-Image-Turbo/
└── 📂 Personalized_Model/
    └── 📦 Z-Image-Turbo-Fun-Controlnet-Union.safetensors
```

Then run the file `examples/z_image_fun/predict_t2i_control.py`.
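
For example, from the root of the VideoX-Fun checkout (the prompt, control image, and model paths are presumably set inside the script or via its own arguments, so check the script before running):

```sh
python examples/z_image_fun/predict_t2i_control.py
```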