- HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning
  Paper • 2603.17024 • Published • 106
- WorldAgents: Can Foundation Image Models be Agents for 3D World Models?
  Paper • 2603.19708 • Published • 12
- MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
  Paper • 2603.25319 • Published • 29
ZhengQi Wan
Vanqi
AI & ML interests
None yet
Recent Activity
updated a collection about 8 hours ago
From Vision to Motion
upvoted a paper about 8 hours ago
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
updated a collection 7 days ago
From Vision to Motion
Organizations
None yet