EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models Paper • 2512.14666 • Published 10 days ago • 8
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 101
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation Paper • 2508.03694 • Published Aug 5 • 51
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published Mar 25 • 73
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 89