Representation Alignment for Just Image Transformers is not Easier than You Think Paper • 2603.14366 • Published 18 days ago • 13
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published 7 days ago • 150
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published 6 days ago • 45
Inference-time Physics Alignment of Video Generative Models with Latent World Models Paper • 2601.10553 • Published Jan 15 • 13
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 7 days ago • 46
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published 17 days ago • 151
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation Paper • 2603.16871 • Published 16 days ago • 60
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published 30 days ago • 145
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling Paper • 2603.04553 • Published 29 days ago • 3
Mode Seeking meets Mean Seeking for Fast Long Video Generation Paper • 2602.24289 • Published Feb 27 • 41
SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models Paper • 2602.18993 • Published Feb 22 • 4
VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory Paper • 2512.04519 • Published Dec 4, 2025 • 6
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published Feb 5 • 36
Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation Paper • 2602.02214 • Published Feb 2 • 24