Update post UGI results
The UGI results were very interesting: the abliterated version is SMARTER than the original, but even more surprisingly, the world model was significantly changed, as measured via UGI's political ideology.
Usually, left-leaning models are smarter than models of any other ideology (very likely because that leaning aligns better with the frontier training data that all open models stem from, in one way or another), but NOT in this case.
Moreover, despite the extremely low KL divergence, the abliterated model demonstrates significant innate changes: both its intelligence and its world model shifted, and its writing style was altered too.
All of these results are surprising, especially when combined. This is exactly why alignment should be studied more and tested empirically. Papers often contradict real-world results; it is in the public good for results to be open and reproducible.
Impish_LLAMA_4B_Abliterated is an abliterated variant of SicariusSicariiStuff/Impish_LLAMA_4B with surgical removal of refusal mechanisms. This model maintains the full capabilities of the original, while eliminating safety guardrails through orthogonalization techniques.
- KL divergence: <0.01
- Refusals: ~3%
What is KL divergence?
Think of it as a way to measure the difference between the "World Model" of the original model and that of the abliterated one; the lower the KL divergence, the closer the two "World Models" are to each other.
If the original model thinks making pineapple pizza is a crime against humanity (it is), then the abliterated model will still hold to this belief, but if asked how to make one (probably after giving you a disclaimer about what an abomination that is), it would still tell you how. In other words, most of the knowledge, quirks, and capabilities are preserved.
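To make the idea concrete, here is a minimal sketch of KL divergence computed for a single next-token prediction. The logits below are made-up numbers over a toy 4-token vocabulary, and the helper names (`softmax`, `kl_divergence`) are illustrative only. This card does not state how its KL figure was measured; in practice such a number is usually averaged over many prompts and token positions, so treat this as an illustration of the formula, not the actual evaluation.

```python
import math

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits for the same prompt (assumed values).
original_logits = [2.0, 1.0, 0.5, -1.0]
abliterated_logits = [2.1, 0.9, 0.5, -1.0]

p = softmax(original_logits)      # original model's distribution
q = softmax(abliterated_logits)   # abliterated model's distribution

kl = kl_divergence(p, q)
print(f"KL(original || abliterated) = {kl:.5f}")
```

A value of exactly 0 would mean the two models assign identical probabilities to every token; small positive values mean the predictions barely moved.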
Technical Specs
- Base Model: Impish_LLAMA_4B
- Parameters: 4B
- Context Length: 128K tokens
- Architecture: Llama (decoder-only transformer)
- Precision: bf16
- Method: Orthogonalization-based abliteration
- License: Llama 3.1 Community License
Methodology
- Identifies refusal direction vectors in activation space
- Orthogonalizes weights to inhibit activation along these directions
- Preserves (mostly) all other model behaviors and knowledge
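The steps above can be sketched on toy data as follows. This is a simplified illustration, not the exact pipeline used for this model: it assumes the refusal direction is estimated as the normalized difference of mean activations between two prompt sets (a common approach, but an assumption here), and all activations, weights, and function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def refusal_direction(refused_acts, complied_acts):
    """Unit vector along (mean refused activation - mean complied activation)."""
    dim = len(refused_acts[0])
    mean_r = [sum(a[i] for a in refused_acts) / len(refused_acts) for i in range(dim)]
    mean_c = [sum(a[i] for a in complied_acts) / len(complied_acts) for i in range(dim)]
    return normalize([r - c for r, c in zip(mean_r, mean_c)])

def orthogonalize(weight, direction):
    """Return W' = (I - r r^T) W for a (d_out x d_in) matrix stored as rows.

    After this, no output of W' has any component along `direction`.
    """
    d_out, d_in = len(weight), len(weight[0])
    # proj[j] = (r^T W)[j]: how much column j writes along the direction
    proj = [sum(direction[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[weight[i][j] - direction[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]

def matvec(weight, x):
    return [dot(row, x) for row in weight]

# Toy activations (dimension 3) collected on two prompt sets (assumed values).
refused_acts = [[1.0, 2.0, 0.5], [1.2, 1.8, 0.4]]
complied_acts = [[0.1, 2.1, 0.5], [0.0, 1.9, 0.6]]

# Toy weight matrix that writes into the 3-dimensional activation space.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]

r = refusal_direction(refused_acts, complied_acts)
W_abl = orthogonalize(W, r)

x = [1.0, -2.0]
print("component along r before:", dot(r, matvec(W, x)))
print("component along r after: ", dot(r, matvec(W_abl, x)))
```

Only the component along the estimated direction is removed; everything orthogonal to it is left untouched, which is why most other behaviors and knowledge survive.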
Model Details
Intended use: General Tasks, Roleplay.
Censorship level: Very Low, 7.5 / 10 (10 = completely uncensored)
UGI score:
Citation Information
@llm{Impish_LLAMA_4B_Abliterated,
author = {SicariusSicariiStuff},
title = {Impish_LLAMA_4B_Abliterated},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_Abliterated}
}
Other stuff
- Impish_LLAMA_4B: the “Impish experience”, now runnable on spinning rust & toasters.
- SLOP_Detector: Nuke GPTisms, with SLOP detector.