Thread three - a LizRob6913 Collection

updated 3 days ago

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published 14 days ago • 17

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26 • 110