ZAYA1-8B Coder GGUF

GGUF quantizations of josephmayo/ZAYA1-8B-Coder, the Coder model produced by merging the josephmayo/ZAYA1-8B-Coder-LoRA adapter into Zyphra/ZAYA1-8B.

Evaluation Gate

The LoRA was evaluated against the base model on 50 Python code-generation prompts with a 0-10 heuristic score:

  • Base average: 2.36 / 10
  • LoRA average: 4.76 / 10
  • Absolute score delta: +2.40 / 10
  • Full-scale lift: 24.00%
  • Relative lift over base average: 101.69%
  • Improved prompts: 39 / 50
  • Merge threshold: 20.00%
  • Merge decision: true

Full-scale lift, the metric the notebook requires for the merge decision, is the score delta expressed as a percentage of the 10-point scale:

((lora_avg - base_avg) / 10) * 100
((4.76 - 2.36) / 10) * 100 = 24.00%
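
The gate arithmetic can be reproduced with a few lines of Python. This is a minimal sketch, not the notebook's code: the function name `merge_gate` and its parameters are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption, since the card only states that 24.00% passed a 20.00% threshold.

```python
def merge_gate(base_avg, lora_avg, scale_max=10.0, threshold_pct=20.0):
    """Recompute the lift metrics reported above (hypothetical helper)."""
    delta = lora_avg - base_avg
    # Full-scale lift: delta as a percentage of the 0-10 scoring scale.
    full_scale_lift = delta / scale_max * 100
    # Relative lift: delta as a percentage of the base model's average.
    relative_lift = delta / base_avg * 100
    return {
        "delta": round(delta, 2),
        "full_scale_lift_pct": round(full_scale_lift, 2),
        "relative_lift_pct": round(relative_lift, 2),
        # Assumption: the threshold comparison is inclusive.
        "merge": full_scale_lift >= threshold_pct,
    }


result = merge_gate(base_avg=2.36, lora_avg=4.76)
print(result)
```

The two lift figures differ only in the denominator: full-scale lift divides by the 10-point scale, while relative lift divides by the base average, which is why a low base score produces a relative lift above 100%.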

Architecture And Conversion Notes

ZAYA uses general.architecture = zaya in GGUF. Mainline llama.cpp did not recognize that architecture during quantization, so these files were generated with the experimental ZAYA llama.cpp branch that includes zaya.cpp model support.
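
To confirm which architecture a GGUF file declares without loading it in llama.cpp, the metadata can be read directly. The sketch below assumes the files follow the standard GGUF v2/v3 layout (magic, version, tensor count, key/value count, then typed key/value pairs, all little-endian); the function `read_architecture` is a hypothetical helper, not part of llama.cpp or the ZAYA branch.

```python
import struct

# GGUF metadata value type IDs mapped to struct formats (scalar types only).
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("truncated GGUF file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    length = _read(f, "<Q")  # strings are length-prefixed with a uint64
    data = f.read(length)
    if len(data) != length:
        raise ValueError("truncated GGUF string")
    return data.decode("utf-8")


def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_architecture(path):
    """Return the general.architecture string, or None if the key is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _read(f, "<I")
        if version < 2:
            raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
        _read(f, "<Q")             # tensor count (unused)
        kv_count = _read(f, "<Q")  # number of metadata key/value pairs
        for _ in range(kv_count):
            key = _read_string(f)
            value = _read_value(f, _read(f, "<I"))
            if key == "general.architecture":
                return value
    return None
```

For the files in this repo, `read_architecture("zaya1-8b-coder-q4_k_m.gguf")` would be expected to return `"zaya"`, given the architecture key described above.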

The conversion path was:

  1. Evaluate base vs LoRA on 50 Python prompts.
  2. Merge the adapter after the 24.00% full-scale lift passed the 20.00% threshold.
  3. Save the merged model to Hugging Face safetensors.
  4. Convert merged safetensors to FP16 GGUF with ZAYA-aware convert_hf_to_gguf.py.
  5. Quantize the FP16 GGUF with the ZAYA-aware llama-quantize.
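
Steps 4 and 5 can be sketched as a small command builder. Several things here are assumptions rather than details from this card: that the ZAYA branch keeps mainline llama.cpp's `--outfile` and `--outtype` options on convert_hf_to_gguf.py and the `llama-quantize <input> <output> <type>` argument order, that the binary is built under `build/bin`, and the directory names and the intermediate FP16 file name, which are hypothetical. The commands are only constructed, not executed.

```python
from pathlib import Path

# Quantization types matching the files published in this repo.
QUANT_TYPES = ["Q4_K_M", "Q6_K", "Q8_0"]


def build_commands(merged_dir, out_dir, llama_cpp_dir):
    """Build argv lists for the convert and quantize steps (hypothetical helper)."""
    merged_dir, out_dir, llama_cpp_dir = Path(merged_dir), Path(out_dir), Path(llama_cpp_dir)
    # Hypothetical name for the intermediate FP16 GGUF.
    f16_path = out_dir / "zaya1-8b-coder-f16.gguf"

    commands = [[
        "python", str(llama_cpp_dir / "convert_hf_to_gguf.py"),
        str(merged_dir),
        "--outfile", str(f16_path),
        "--outtype", "f16",
    ]]
    # Assumption: llama-quantize lives in build/bin, as in a default CMake build.
    quantize_bin = llama_cpp_dir / "build" / "bin" / "llama-quantize"
    for qtype in QUANT_TYPES:
        out_path = out_dir / f"zaya1-8b-coder-{qtype.lower()}.gguf"
        commands.append([str(quantize_bin), str(f16_path), str(out_path), qtype])
    return commands


for argv in build_commands("merged-model", "gguf-out", "llama.cpp"):
    print(" ".join(argv))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` once the paths point at a real checkout of the ZAYA-aware branch.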

The Kaggle run completed the evaluation and the merged-model upload, but its disk was too small to hold the merged shards, the FP16 GGUF, and the quantized outputs at the same time. Quantization was therefore done locally with the same ZAYA llama.cpp branch, and the resulting files were pushed to this repo.

Files

  • zaya1-8b-coder-q4_k_m.gguf
  • zaya1-8b-coder-q6_k.gguf
  • zaya1-8b-coder-q8_0.gguf
  • zaya1_8b_coder_gguf_summary.json

File Sizes

  • Q4_K_M: 5,567,581,024 bytes
  • Q6_K: 7,353,195,104 bytes
  • Q8_0: 9,485,673,760 bytes
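
The byte counts above can be used to sanity-check a download. The following is a minimal sketch: the helper names `to_gib` and `check_sizes` are hypothetical, and a matching size only rules out truncated downloads, not corruption.

```python
from pathlib import Path

# Byte counts as listed in this card.
EXPECTED_BYTES = {
    "zaya1-8b-coder-q4_k_m.gguf": 5_567_581_024,
    "zaya1-8b-coder-q6_k.gguf": 7_353_195_104,
    "zaya1-8b-coder-q8_0.gguf": 9_485_673_760,
}


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


def check_sizes(directory, expected=EXPECTED_BYTES):
    """Return {filename: (actual_or_None, expected)} for missing or mismatched files."""
    problems = {}
    for name, want in expected.items():
        path = Path(directory) / name
        actual = path.stat().st_size if path.is_file() else None
        if actual != want:
            problems[name] = (actual, want)
    return problems


for name, size in EXPECTED_BYTES.items():
    print(f"{name}: {to_gib(size):.2f} GiB")
print(f"total: {to_gib(sum(EXPECTED_BYTES.values())):.2f} GiB")
```

An empty dictionary from `check_sizes(download_dir)` means every listed file is present with the expected size.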

Model Details

  • Format: GGUF
  • Model size: 9B params
  • Architecture: zaya