Distilling 100B+ Models 40x Faster with TRL • TRL distillation for 100B+ teachers, 40x faster • Running • Featured • 63
bartowski/nvidia_Nemotron-Cascade-2-30B-A3B-GGUF Text Generation • 32B • Updated 28 days ago • 36.6k • 30
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 13 days ago • 577k • 2.73k