snowflakewang

SnowflakeWang

AI & ML interests

None yet

Recent Activity

authored a paper 3 days ago

Boosting 3D Object Generation through PBR Materials

Organizations

None yet

upvoted 2 papers 2 days ago

How Much 3D Do Video Foundation Models Encode?

Paper • 2512.19949 • Published 5 days ago • 7

Spatia: Video Generation with Updatable Spatial Memory

Paper • 2512.15716 • Published 11 days ago • 20

upvoted a paper 3 days ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 5 days ago • 46

upvoted a paper 4 days ago

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published 5 days ago • 87

upvoted a paper 9 days ago

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published 10 days ago • 19

upvoted a paper 11 days ago

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

Paper • 2512.14614 • Published 12 days ago • 65

upvoted 3 papers 13 days ago

Exploring MLLM-Diffusion Information Transfer with MetaCanvas

Paper • 2512.11464 • Published 16 days ago • 12

V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

Paper • 2512.11799 • Published 16 days ago • 29

EgoX: Egocentric Video Generation from a Single Exocentric Video

Paper • 2512.08269 • Published 19 days ago • 111

upvoted a paper 19 days ago

EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

Paper • 2512.06065 • Published 23 days ago • 28

upvoted 4 papers 22 days ago

LATTICE: Democratize High-Fidelity 3D Generation at Scale

Paper • 2512.03052 • Published Nov 24 • 10

SIMA 2: A Generalist Embodied Agent for Virtual Worlds

Paper • 2512.04797 • Published 24 days ago • 23

NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Paper • 2512.05106 • Published 24 days ago • 15

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 24 days ago • 168

upvoted a paper 29 days ago

Video Generation Models Are Good Latent Reward Models

Paper • 2511.21541 • Published Nov 26 • 45

upvoted 5 papers about 1 month ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published Nov 20 • 109

GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

Paper • 2511.15705 • Published Nov 19 • 92

Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks

Paper • 2511.15065 • Published Nov 19 • 74

A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

Paper • 2511.10555 • Published Nov 13 • 60

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

Paper • 2511.13647 • Published Nov 17 • 70