Test-Time Compute/Optimal Scaling
Scaling LLM Inference with Optimized Sample Compute Allocation
Paper
• 2410.22480
• Published
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper
• 2501.02497
• Published
• 45
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
Paper
• 2412.14135
• Published
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper
• 2501.04682
• Published
• 99
O1 Replication Journey: A Strategic Progress Report -- Part 1
Paper
• 2410.18982
• Published
• 3
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning
Paper
• 2501.06458
• Published
• 31
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Paper
• 2410.03864
• Published
• 12
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper
• 2501.18585
• Published
• 61
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
Paper
• 2501.13200
• Published
• 69
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper
• 2502.03373
• Published
• 58
Inference-Time Scaling for Generalist Reward Modeling
Paper
• 2504.02495
• Published
• 58
TTRL: Test-Time Reinforcement Learning
Paper
• 2504.16084
• Published
• 120
Scaling Test-time Compute for LLM Agents
Paper
• 2506.12928
• Published
• 63