The Landscape of Agentic Reinforcement Learning for LLMs: A Survey • Paper • 2509.02547 • Published Sep 2, 2025 • 228
TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning • Paper • 2506.13705 • Published Jun 16, 2025 • 2
verl-agent • Collection • Open-source models trained via GiGPO and verl-agent • 4 items • Updated Jun 20, 2025 • 2
Group-in-Group Policy Optimization for LLM Agent Training • Paper • 2505.10978 • Published May 16, 2025 • 18
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models • Paper • 2505.10554 • Published May 15, 2025 • 120