What is the least censored abliterated model?

In GPTfake’s standardized set, the Llama-3-uncensored (abliterated) build has the lowest measured refusal rate (3.1%) as of 2026-06-15 — about a quarter of the least-restrictive mainstream model (Mistral, 11.2%). Figures are illustrative pending live data.

Do abliterated models keep their capabilities?

Partly. GPTfake estimates capability-retention at 85–96% of the stock model across the builds we track as of 2026-06-15, with the two abliterated builds at the low end (85% and 89%). Abliteration measurably degrades reasoning and factual accuracy even as it removes refusals.

Are abliterated models safe to use?

No. A near-zero refusal rate means the build will also comply with genuinely harmful requests, because abliteration removes safety behavior, not just over-refusal. GPTfake measures these builds; it does not host or recommend them.

Abliterated & Uncensored Model Benchmark

According to GPTfake monitoring, as of 2026-06-15 the Llama-3-uncensored (abliterated) build refused just 3.1% of our standardized prompts — roughly a quarter of the least-restrictive mainstream model (Mistral, 11.2%) — but its capability-retention dropped to 89% of the stock model’s reasoning score. Abliteration buys near-total compliance at a measurable cost in accuracy. Figures here are illustrative across a fixed 500-prompt set until live data lands.

3.1% vs 11.2% for the least-restrictive mainstream model (Mistral)

Llama-3-uncensored (abliterated) refusal rate. GPTfake monitoring, as of 2026-06-15, same fixed 500-prompt set, tested daily. Illustrative.

What are abliterated models?

Abliterated models are open-weight LLMs whose refusal behavior has been surgically removed from their weights, so they answer prompts a stock model would decline. Community builds (the Dolphin, Hermes, and Llama-uncensored families) serve the growing demand for “uncensored llm” and “least censored ai”, yet mainstream benchmarks avoid them for reputational reasons. GPTfake measures them as an independent watchdog. For the technique, see what is an abliterated model.

These are illustrative figures from our standardized set, not a live feed. Abliteration removes safety behavior, not just over-refusal — a near-zero refusal rate means the build also complies with genuinely harmful requests. GPTfake measures these builds as an independent watchdog and does not host, distribute, or recommend them. GPTfake is not funded by any AI lab.

The benchmark

Uncensored and abliterated community builds, ordered from least to most restrictive by overall refusal rate. Capability-retention estimates the build’s reasoning/accuracy as a share of its stock base model (lower = more degradation relative to the base model).

Rank	Build	Family	Base model	Method	Refusal rate	Capability-retention	As of	Sample
1	Llama-3-uncensored	Llama-uncensored	Llama 3 8B	Abliterated	3.1%	89%	2026-06-15	n = 500
2	Dolphin 2.9	Dolphin	Llama 3 8B	Fine-tuned	4.4%	94%	2026-06-15	n = 500
3	Hermes 3	Hermes	Llama 3.1 8B	Fine-tuned	6.8%	96%	2026-06-15	n = 500
4	Qwen2-abliterated	Llama-uncensored	Qwen2 7B	Abliterated	7.2%	85%	2026-06-15	n = 500
—	Mistral (mainstream ref.)	—	Mistral Large	Stock	11.2%	100% (ref.)	2026-06-15	n = 500

Illustrative. GPTfake measures the Llama-3-uncensored abliterated build at a 3.1% refusal rate, the lowest of the uncensored builds we track, as of 2026-06-15 (n = 500 per build). The mainstream reference row (Mistral) is the least-restrictive mainstream model on the least-censored ranking. See the methodology for scoring.
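The "about a quarter" comparison can be checked directly from the table. The following is a minimal sketch, not GPTfake's tooling: the rows are copied by hand from the table above, and the helper names (`refusal_ratio`, `ranked`) are hypothetical.

```python
# Rows copied from the benchmark table: (build, refusal rate %, capability-retention %).
ROWS = [
    ("Qwen2-abliterated", 7.2, 85),
    ("Hermes 3", 6.8, 96),
    ("Llama-3-uncensored", 3.1, 89),
    ("Dolphin 2.9", 4.4, 94),
]
# Mainstream reference row from the same table.
MAINSTREAM_REF = ("Mistral", 11.2, 100)


def refusal_ratio(build_rate, ref_rate):
    """Refusal rate of a build as a fraction of the reference model's rate."""
    return build_rate / ref_rate


def ranked(rows):
    """Order builds from least to most restrictive, as the table does."""
    return sorted(rows, key=lambda row: row[1])


for build, rate, _retention in ranked(ROWS):
    ratio = refusal_ratio(rate, MAINSTREAM_REF[1])
    print(f"{build}: {rate}% refusal, {ratio:.2f}x the {MAINSTREAM_REF[0]} reference")
```

For the top-ranked build the ratio is 3.1 / 11.2, or roughly 0.28, which is what "about a quarter" rounds from.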

Refusal vs capability: the trade-off

Abliteration drives the refusal rate toward zero, but it edits the weights that produce refusals — and those weights overlap with reasoning. Our two-number readout makes the trade-off legible:

Pure abliteration removes the most refusals — the Llama-3-uncensored build refuses least (3.1%) but also shows the largest accuracy hit among Llama-based builds (89% retention).
Fine-tuned builds (Dolphin, Hermes) refuse slightly more but retain more capability — they were trained for compliance and helpfulness, not just stripped of refusals.
Cross-base abliteration can degrade more — the Qwen2 abliterated build shows the lowest retention (85%), consistent with stronger topic-specific filtering being harder to remove cleanly.

A low refusal rate measures permissiveness, not safety or quality — and here it comes with a measurable capability cost. See what is an abliterated model for how the technique works.
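One way to read the two-number trade-off is to ask which builds are not beaten on both numbers at once. The sketch below is an illustration only, not part of the GPTfake methodology: it treats "lower refusal" and "higher retention" as the two axes the table reports (not as a recommendation), and the function names are hypothetical. Retention is measured against different base models, so comparisons across bases are rough.

```python
# (build, refusal rate %, capability-retention %) copied from the benchmark table.
BUILDS = [
    ("Llama-3-uncensored", 3.1, 89),
    ("Dolphin 2.9", 4.4, 94),
    ("Hermes 3", 6.8, 96),
    ("Qwen2-abliterated", 7.2, 85),
]


def dominates(a, b):
    """True if a refuses no more than b, retains no less, and is strictly ahead on one axis."""
    _, a_refusal, a_retention = a
    _, b_refusal, b_retention = b
    no_worse = a_refusal <= b_refusal and a_retention >= b_retention
    strictly_ahead = a_refusal < b_refusal or a_retention > b_retention
    return no_worse and strictly_ahead


def pareto_front(builds):
    """Builds that no other build beats on both refusal rate and retention."""
    return [
        b for b in builds
        if not any(dominates(other, b) for other in builds if other is not b)
    ]


front = [name for name, _, _ in pareto_front(BUILDS)]
print(front)
```

On these illustrative figures, the three Llama-based builds each trade one number for the other, while the Qwen2 abliterated build sits behind all three on both.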

How we test abliterated builds

We run each community build through the same standardized prompt library used for mainstream models, across multiple sessions, with version tracking and NLP-based classification. Because these are open weights, the results are independently reproducible. That is the same integrity property that makes Mistral our cross-check baseline. Each response is scored for refusal, evasion, and completeness; capability-retention is estimated against the stock base model on a held-out reasoning set. The full protocol is on the monitoring methodology page; concept definitions are on what is an abliterated model.
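The aggregation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual pipeline: it assumes each response receives exactly one of three labels, and the label names, function names, and toy inputs are all hypothetical.

```python
from collections import Counter

# Hypothetical label names; the source only says responses are scored for
# refusal, evasion, and completeness.
LABELS = ("refusal", "evasion", "complete")


def label_rates(labels):
    """Share of responses carrying each label (assumes one label per response)."""
    if not labels:
        raise ValueError("no labelled responses")
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    return {name: counts[name] / total for name in LABELS}


def capability_retention(build_score, stock_score):
    """Build's score on the held-out reasoning set as a share of the stock model's score."""
    if stock_score <= 0:
        raise ValueError("stock score must be positive")
    return build_score / stock_score


# Toy inputs, made up for illustration; these are not measurements.
toy_labels = ["refusal"] + ["evasion"] * 2 + ["complete"] * 7
print(label_rates(toy_labels))
print(capability_retention(build_score=45, stock_score=50))
```

Keeping the refusal rate and the retention ratio as separate functions mirrors the two-number readout: one is a count over classified responses, the other a comparison against the stock base model.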

Caveats & limitations

Illustrative figures. The numbers above are snapshots from our test set, not a live feed.
Low refusal ≠ safe. A near-zero refusal rate means the build complies with harmful requests too. Abliteration removes safety behavior, not just over-refusal.
Build churn. Community builds are re-quantized and re-released constantly; a given checkpoint’s numbers drift fast.
Capability-retention is an estimate. It depends on the reasoning set used and the base-model comparison; treat it as directional.
We measure, we don’t distribute. GPTfake reports on these builds as a watchdog and does not host, mirror, or recommend them.
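A further reason to treat the snapshots as directional is sampling noise at n = 500. The sketch below computes a Wilson score interval for a measured refusal rate. It is an illustration, not part of the stated protocol, and it assumes each prompt behaves like an independent refuse/comply trial, which is a simplification for a fixed prompt set run across multiple sessions.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed over n trials.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0.0 <= p_hat <= 1.0:
        raise ValueError("need n > 0 and 0 <= p_hat <= 1")
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Refusal rates and n taken from the benchmark table (illustrative figures).
for build, rate in [("Llama-3-uncensored", 0.031), ("Dolphin 2.9", 0.044)]:
    low, high = wilson_interval(rate, 500)
    print(f"{build}: {rate:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

Under these assumptions the intervals for the top two builds overlap, so small gaps between adjacent ranks should not be over-read.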

What is an abliterated model?

The definition: how refusal directions are surgically removed from weights.

Least censored AI models

Where these builds sit against mainstream models, ranked by data.

AI censorship leaderboard

Refusal rate and bias for every mainstream model we monitor.

Open datasets

Download the refusal measurements behind these figures (CSV/JSON).
