aquiffoo/neo-3-1B-A90M-Base
Text Generation • 1.0B
My series of fully open, state-of-the-art small mixture-of-experts models.
Note: Base model, 990M total parameters, ~100M active
Note: Instruct model, 990M total parameters, ~100M active
Note: NeoFlashInfer-based Base model, 990M total parameters, ~100M active
Note: Base model, 3.11B total parameters, ~380M active
Note: Thinking model, 3.11B total parameters, ~380M active
Note: NeoFlashInfer-based Base model, 3.11B total parameters, ~380M active
Note: Pretraining data for both model sizes (1B-A90M and 3B-A400M, including the Flash variants)
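
To put the total and active parameter counts in the notes above side by side, here is a minimal sketch that computes the active share for each model size. The figures are the approximate ones quoted in the notes; the dictionary keys and the `active_fraction` helper are hypothetical names used only for illustration, and "active" is assumed to mean the parameters used per token, as is usual for mixture-of-experts models.

```python
# Approximate parameter counts copied from the notes above, in millions.
# Keys are illustrative labels, not official identifiers.
MODELS = {
    "1B-A90M": {"total_m": 990, "active_m": 100},
    "3B-A400M": {"total_m": 3110, "active_m": 380},
}


def active_fraction(total_m: float, active_m: float) -> float:
    """Return active parameters as a fraction of total parameters."""
    return active_m / total_m


for name, p in MODELS.items():
    frac = active_fraction(p["total_m"], p["active_m"])
    print(f"{name}: about {frac:.0%} of parameters active")
```

Under these rounded figures, both sizes activate roughly a tenth to an eighth of their total parameters.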