TAO71-AI Quants: Other
| Quant | Size | Description |
|---|---|---|
| Q2_K | 1.17 GB | Not recommended for most people. Very low quality. |
| Q2_K_L | 1.23 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q2_K for everything else. Very low quality. |
| Q2_K_XL | 1.46 GB | Not recommended for most people. Uses F16 for output and embedding, and Q2_K for everything else. Very low quality. |
| Q3_K_S | 1.33 GB | Not recommended for most people. Prefer any bigger Q3_K quantization. Low quality. |
| Q3_K_M | 1.46 GB | Not recommended for most people. Low quality. |
| Q3_K_L | 1.57 GB | Not recommended for most people. Low quality. |
| Q3_K_XL | 1.63 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q3_K_L for everything else. Low quality. |
| Q3_K_XXL | 1.86 GB | Not recommended for most people. Uses F16 for output and embedding, and Q3_K_L for everything else. Low quality. |
| Q4_K_S | 1.69 GB | Recommended. Slightly low quality. |
| Q4_K_M | 1.78 GB | Recommended. Decent quality for most use cases. |
| Q4_K_L | 1.84 GB | Recommended. Uses Q8_0 for output and embedding, and Q4_K_M for everything else. Decent quality. |
| Q4_K_XL | 2.07 GB | Recommended. Uses F16 for output and embedding, and Q4_K_M for everything else. Decent quality. |
| Q5_K_S | 2.01 GB | Recommended. High quality. |
| Q5_K_M | 2.06 GB | Recommended. High quality. |
| Q5_K_L | 2.12 GB | Recommended. Uses Q8_0 for output and embedding, and Q5_K_M for everything else. High quality. |
| Q5_K_XL | 2.35 GB | Recommended. Uses F16 for output and embedding, and Q5_K_M for everything else. High quality. |
| Q6_K | 2.36 GB | Recommended. Very high quality. |
| Q6_K_L | 2.42 GB | Recommended. Uses Q8_0 for output and embedding, and Q6_K for everything else. Very high quality. |
| Q6_K_XL | 2.65 GB | Recommended. Uses F16 for output and embedding, and Q6_K for everything else. Very high quality. |
| Q8_0 | 3.05 GB | Recommended. Quality nearly identical to F16. |
| F16 | 5.74 GB | Not recommended. Overkill. Prefer Q8_0. |
| ORIGINAL (BF16) | 5.74 GB | Not recommended. Overkill. Prefer Q8_0. |
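
The table can be read as a simple lookup: pick the largest recommended file that fits your memory budget. The sketch below illustrates that reading in Python. The sizes and recommendations are copied from the table; the `pick_quant` helper and the `QUANTS` list are hypothetical names introduced here for illustration and are not part of the TAO71-AI AutoQuantizer. Note that the file size is only a lower bound on memory use, since the runtime also needs room for the context (KV cache), so leave some headroom when choosing a budget.

```python
# (name, size in GB, recommended?) copied from the table above.
QUANTS = [
    ("Q2_K", 1.17, False), ("Q2_K_L", 1.23, False), ("Q2_K_XL", 1.46, False),
    ("Q3_K_S", 1.33, False), ("Q3_K_M", 1.46, False), ("Q3_K_L", 1.57, False),
    ("Q3_K_XL", 1.63, False), ("Q3_K_XXL", 1.86, False),
    ("Q4_K_S", 1.69, True), ("Q4_K_M", 1.78, True), ("Q4_K_L", 1.84, True),
    ("Q4_K_XL", 2.07, True),
    ("Q5_K_S", 2.01, True), ("Q5_K_M", 2.06, True), ("Q5_K_L", 2.12, True),
    ("Q5_K_XL", 2.35, True),
    ("Q6_K", 2.36, True), ("Q6_K_L", 2.42, True), ("Q6_K_XL", 2.65, True),
    ("Q8_0", 3.05, True),
    ("F16", 5.74, False), ("ORIGINAL (BF16)", 5.74, False),
]


def pick_quant(budget_gb, recommended_only=True):
    """Return the name of the largest quant whose file fits in budget_gb.

    Hypothetical helper: returns None when nothing fits. If two files have
    the same size, the one listed first in the table wins.
    """
    candidates = [
        (name, size)
        for name, size, recommended in QUANTS
        if size <= budget_gb and (recommended or not recommended_only)
    ]
    if not candidates:
        return None
    # max() keeps the first maximal element, so table order breaks ties.
    return max(candidates, key=lambda item: item[1])[0]


if __name__ == "__main__":
    for budget in (1.0, 2.0, 2.5, 4.0):
        print(f"{budget:.1f} GB budget -> {pick_quant(budget)}")
```

With a 2.0 GB budget, for example, this selects Q4_K_L, because Q5_K_S (2.01 GB) is just over the limit. Passing `recommended_only=False` also considers the rows marked "Not recommended".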
Quantized using TAO71-AI AutoQuantizer. You can check out the original model card here.
Base model: HuggingFaceTB/SmolLM3-3B-Base