From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors • Paper • 2605.31042 • Published 17 days ago • 18
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models • Paper • 2605.20873 • Published 26 days ago • 44
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence • Paper • 2604.18292 • Published Apr 20 • 85
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning • Paper • 2603.03379 • Published Mar 3 • 32
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories • Paper • 2602.10809 • Published Feb 11 • 59
LawThinker: A Deep Research Legal Agent in Dynamic Environments • Paper • 2602.12056 • Published Feb 12 • 35
GISA: A Benchmark for General Information-Seeking Assistant • Paper • 2602.08543 • Published Feb 9 • 26
MemoBrain: Executive Memory as an Agentic Brain for Reasoning • Paper • 2601.08079 • Published Jan 12 • 39
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use • Paper • 2510.27363 • Published Oct 31, 2025 • 23
The End of Manual Decoding: Towards Truly End-to-End Language Models • Paper • 2510.26697 • Published Oct 30, 2025 • 121
DeepAgent: A General Reasoning Agent with Scalable Toolsets • Paper • 2510.21618 • Published Oct 24, 2025 • 103
ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability • Paper • 2508.07050 • Published Aug 9, 2025 • 117
MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings • Paper • 2506.23115 • Published Jun 29, 2025 • 36
WebThinker: Empowering Large Reasoning Models with Deep Research Capability • Paper • 2504.21776 • Published Apr 30, 2025 • 60