shoaibmohd's Collections
Learning from examples - training/inference
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 80
A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
Paper • 2510.01132 • Published • 5
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper • 2510.04618 • Published • 127
MixReasoning: Switching Modes to Think
Paper • 2510.06052 • Published • 21
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 270
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Paper • 2510.08002 • Published • 23
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper • 2510.07499 • Published • 48
Dr.LLM: Dynamic Layer Routing in LLMs
Paper • 2510.12773 • Published • 31
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Paper • 2511.16043 • Published • 108
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Paper • 2511.14460 • Published • 20