🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated Mar 2 • 13
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published Oct 28, 2025 • 18
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 16
ML Intern 🤖 Explore machine learning tasks via an interactive web app • Running on CPU Upgrade • Featured • 403
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 164