This is a decensored version of shb777/Llama-3.3-8B-Instruct-128K, made with Heretic v1.1.0.
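For reference, a minimal usage sketch for loading this model with `transformers` (assuming a recent `transformers` version and a GPU with enough memory for an 8B model in bfloat16; the example prompt is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aeon37/Llama-3.3-8B-Instruct-128K-heretic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt using the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize the plot of Hamlet."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```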
## Abliteration parameters
| Parameter | Value |
|---|---|
| direction_index | 15.28 |
| attn.o_proj.max_weight | 1.45 |
| attn.o_proj.max_weight_position | 21.79 |
| attn.o_proj.min_weight | 1.21 |
| attn.o_proj.min_weight_distance | 11.77 |
| mlp.down_proj.max_weight | 1.22 |
| mlp.down_proj.max_weight_position | 29.78 |
| mlp.down_proj.min_weight | 0.78 |
| mlp.down_proj.min_weight_distance | 17.51 |
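These parameters control how strongly a refusal direction (selected by `direction_index`) is projected out of each layer's `attn.o_proj` and `mlp.down_proj` weights, with the ablation strength peaking at `max_weight_position` and tapering toward `min_weight`. Below is a minimal sketch of the underlying directional-ablation (weight orthogonalization) idea, assuming a precomputed refusal direction; it illustrates the technique generically rather than reproducing Heretic's exact per-layer schedule:

```python
import torch

def ablate_direction(W: torch.Tensor, d: torch.Tensor, weight: float) -> torch.Tensor:
    """Remove the component of W's output that lies along direction d.

    W: (out_features, in_features) weight matrix (e.g. o_proj or down_proj).
    d: (out_features,) refusal direction in the residual stream.
    weight: ablation strength (1.0 = exact orthogonalization; values above
            1.0, as in the table above, overshoot past orthogonal).
    """
    d = d / d.norm()
    # Subtract the rank-1 projection of W onto d, scaled by the weight.
    return W - weight * torch.outer(d, d @ W)

# Toy example with a random layer weight and direction.
W = torch.randn(4096, 4096)
d = torch.randn(4096)
W_ablated = ablate_direction(W, d, weight=1.22)

# After exact orthogonalization (weight=1.0), the layer's output has no
# component along d.
W_orth = ablate_direction(W, d, weight=1.0)
print(torch.allclose(d / d.norm() @ W_orth, torch.zeros(4096), atol=1e-3))
```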
## Performance
| Metric | This model | Original model (shb777/Llama-3.3-8B-Instruct-128K) |
|---|---|---|
| KL divergence | 0.0430 | 0 (by definition) |
| Refusals | 8/100 | 95/100 |
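The KL divergence row measures how far the decensored model's next-token distributions drift from the original on harmless prompts; a low value means behavior outside of refusals is largely preserved. A hedged sketch of that kind of measurement follows; the probe prompt set and aggregation are assumptions for illustration, not Heretic's exact evaluation harness:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

ORIGINAL = "shb777/Llama-3.3-8B-Instruct-128K"
ABLATED = "aeon37/Llama-3.3-8B-Instruct-128K-heretic"

tokenizer = AutoTokenizer.from_pretrained(ORIGINAL)
ref = AutoModelForCausalLM.from_pretrained(
    ORIGINAL, torch_dtype=torch.bfloat16, device_map="auto"
)
new = AutoModelForCausalLM.from_pretrained(
    ABLATED, torch_dtype=torch.bfloat16, device_map="auto"
)

# Harmless probe prompts (assumption; a real run would use many).
prompts = ["Explain photosynthesis in one paragraph."]

total, count = 0.0, 0
for prompt in prompts:
    ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True, return_tensors="pt",
    ).to(ref.device)
    with torch.no_grad():
        p_logits = ref(ids).logits[0, -1].float()
        q_logits = new(ids.to(new.device)).logits[0, -1].float()
    # KL(P_original || Q_ablated) over the next-token distribution.
    kl = F.kl_div(
        F.log_softmax(q_logits, dim=-1),
        F.log_softmax(p_logits, dim=-1),
        log_target=True, reduction="sum",
    )
    total += kl.item()
    count += 1

print(f"mean next-token KL: {total / count:.4f}")
```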
## Llama 3.3 8B 128K Instruct (Fixed)
Original model: allura-forge/Llama-3.3-8B-Instruct. Thanks!
Additional fixes:
- Added `rope_scaling` to the model config
- Added chat template (Unsloth) to the tokenizer config
- Updated the generation config
- Enabled the full 128K context length (see the config check below)
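A quick way to verify these fixes took effect is to inspect the config and tokenizer directly. The field names below follow the standard Llama config in `transformers`; the expected values in the comments are assumptions based on the advertised 128K context:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "aeon37/Llama-3.3-8B-Instruct-128K-heretic"

config = AutoConfig.from_pretrained(model_id)
print(config.rope_scaling)             # rope_scaling dict should be present
print(config.max_position_embeddings)  # expected ~131072 for 128K context (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.chat_template is not None)  # chat template present in tokenizer config
```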
## Evaluation results (self-reported)
| Metric | Dataset | Value |
|---|---|---|
| acc_norm | BBH | 54.10 |
| acc_norm | BBH | 29.90 |
| acc | BBH | 38.00 |
| acc_norm | BBH | 37.80 |
| avg(prompt_strict + inst_strict) | BBH | 85.20 |
| exact_match | BBH | 27.30 |