Article Demystifying DeepSeekMath’s Data Pipeline: A FastText-Based Reproduction and Analysis Jun 1 • 4
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14, 2024 • 71
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent Paper • 2407.21646 • Published Jul 31, 2024 • 18
Article Finetuning clip can be done locally with decent results (even if you are GPU poor). Jun 28, 2024 • 9