Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 54
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models Paper • 2506.03135 • Published Jun 3, 2025 • 40
Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions Paper • 2510.05934 • Published Oct 7, 2025 • 2
VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models Paper • 2509.19803 • Published Sep 24, 2025 • 120
Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding Paper • 2508.20478 • Published Aug 28, 2025 • 17
nablaNABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17, 2025 • 124
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025 • 89
Geometry-Editable and Appearance-Preserving Object Compositon Paper • 2505.20914 • Published May 27, 2025 • 6
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 68
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development Paper • 2506.05010 • Published Jun 5, 2025 • 80
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion Paper • 2403.18818 • Published Mar 27, 2024 • 28 • 4