The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Paper • 2312.13558 • Published Dec 21, 2023
Grokking of Hierarchical Structure in Vanilla Transformers Paper • 2305.18741 • Published May 30, 2023