Reliable Fine-Grained Evaluation of Natural Language Math Proofs Paper • 2510.13888 • Published Oct 14, 2025 • 2