LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 19 days ago • 239
PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research Paper • 2604.15411 • Published 25 days ago • 4
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 28 days ago • 101
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published Apr 9 • 115
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers Paper • 2604.02648 • Published Apr 3 • 47
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183