47

Shashwat

SomeDieYoung27

AI & ML interests

None yet

Recent Activity

upvoted a paper about 19 hours ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

upvoted a paper about 19 hours ago

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

upvoted a paper about 19 hours ago

StoryMem: Multi-shot Long Video Storytelling with Memory

Organizations

None yet

upvoted 3 papers about 19 hours ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published 19 days ago • 126

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Paper • 2512.13604 • Published 13 days ago • 71

StoryMem: Multi-shot Long Video Storytelling with Memory

Paper • 2512.19539 • Published 6 days ago • 17

upvoted 2 papers 11 days ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25 • 166

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 25 days ago • 168

upvoted 15 papers about 1 month ago

MIRIX: Multi-Agent Memory System for LLM-Based Agents

Paper • 2507.07957 • Published Jul 10 • 79

Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs

Paper • 2507.09477 • Published Jul 13 • 86

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Paper • 2507.13348 • Published Jul 17 • 77

Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos

Paper • 2507.15597 • Published Jul 21 • 34

Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

Paper • 2507.16784 • Published Jul 22 • 122

Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 87

Captain Cinema: Towards Short Movie Generation

Paper • 2507.18634 • Published Jul 24 • 41

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21 • 67

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 136

MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE

Paper • 2507.21802 • Published Jul 29 • 17

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published Jul 30 • 68

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Paper • 2508.02317 • Published Aug 4 • 20

Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation

Paper • 2508.03320 • Published Aug 5 • 62

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Paper • 2508.00414 • Published Aug 1 • 93