Test-Time Compute/Optimal Scaling - a alexngai Collection

alexngai 's Collections

Latent Reasoning

Autonomous Research

Memory/Search/Retrieval/RAG

Automated Research

Test-Time Compute/Optimal Scaling

Self-Improving Agents

Codegen Benchmarks

Test-Time Compute/Optimal Scaling

updated Dec 9, 2025

Scaling LLM Inference with Optimized Sample Compute Allocation

Paper • 2410.22480 • Published Oct 29, 2024
Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published Jan 5, 2025 • 45
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

Paper • 2412.14135 • Published Dec 18, 2024
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8, 2025 • 99
O1 Replication Journey: A Strategic Progress Report -- Part 1

Paper • 2410.18982 • Published Oct 8, 2024 • 3
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Paper • 2501.06458 • Published Jan 11, 2025 • 31
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Paper • 2410.03864 • Published Oct 4, 2024 • 12
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30, 2025 • 61
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published Jan 22, 2025 • 69
Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5, 2025 • 58
Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3, 2025 • 58
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 120
Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15, 2025 • 63