ZClip: Adaptive Spike Mitigation for LLM Pre-Training • Paper • 2504.02507 • Published Apr 3, 2025 • 90
Variance Control via Weight Rescaling in LLM Pre-training • Paper • 2503.17500 • Published Mar 21, 2025 • 5
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields • Paper • 2404.01300 • Published Apr 1, 2024 • 4
Understanding Video Transformers via Universal Concept Discovery • Paper • 2401.10831 • Published Jan 19, 2024 • 9