Lgr54HFi
/
chomera

chimera51
custom_code
chomera
309 kB
  • 1 contributor
History: 45 commits
Lgr54HFi
fix: MoE intermediate_size not scaled for tiny — 158M→4M MoE params
6cb7b4d verified 9 days ago
  • chimera
    fix: MoE intermediate_size not scaled for tiny — 158M→4M MoE params 9 days ago
  • tests
    Upload folder using huggingface_hub 10 days ago
  • .gitattributes
    1.52 kB
    initial commit 10 days ago
  • .gitignore
    230 Bytes
    Upload folder using huggingface_hub 10 days ago
  • README.md
    9.11 kB
    Upload folder using huggingface_hub 10 days ago
  • chimera_turbo.py
    22.5 kB
    perf: tune chimera_turbo.py for 300-step convergence + throughput 9 days ago
  • config.json
    25.5 kB
    Upload folder using huggingface_hub 10 days ago
  • gguf_import.py
    29.1 kB
    Upload folder using huggingface_hub 10 days ago
  • inference.py
    12.3 kB
    Upload folder using huggingface_hub 10 days ago
  • launch_turbo.sh
    3.06 kB
    fix: tcmalloc debug .so crash, add error trapping, chmod note 9 days ago
  • pyproject.toml
    789 Bytes
    Upload folder using huggingface_hub 10 days ago
  • train.py
    9.22 kB
    Upload folder using huggingface_hub 10 days ago
  • train_fast.py
    4.96 kB
    Upload folder using huggingface_hub 10 days ago
  • train_hyper.py
    7.57 kB
    fix: batch_size 32→4 base (GrowLength scales up, _safe_batch caps) 9 days ago