Where Did It Go Wrong? Process-Level Evaluation of Web Agents with Semantic State Tracking
Paper • 2606.15673 • Published • 13
LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training
Rethinking State Tracking in Recurrent Models Through Error Control Dynamics