
Least Censored AI Models, Ranked by Data

As of 2026-06-15, Mistral was the least censored mainstream model in GPTfake’s set, declining just 11.2% of standardized prompts, less than half the 24.6% refused by the most restrictive (Qwen). According to GPTfake monitoring, ChatGPT (18.7%) and Gemini (19.8%) follow Mistral, while Claude (22.4%) and Qwen (24.6%) are the two most restrictive. “Least censored” means the lowest measured refusal rate on a fixed prompt set, not “uncensored.” Figures are illustrative until live data lands.

Least-censored LLM refusal rate (Mistral): 11.2%, vs 24.6% for the most restrictive (Qwen). Source: GPTfake monitoring, same fixed 500-prompt set, tested daily. Illustrative.

Which AI model censors the least?

As of 2026-06-15, GPTfake ranks Mistral the least censored of five major LLMs, with an 11.2% refusal rate versus 24.6% for the most restrictive (Qwen). The ranking is based on identical prompts run daily. A low refusal rate measures permissiveness, not safety or quality — see what is AI censorship for what the metric does and does not capture.

“Least censored” is not the same as “uncensored,” and a low refusal rate is not automatically good — some refusals are appropriate safety behavior. These are illustrative figures from our standardized set, produced by our own harness and explained on the monitoring methodology page. Read the caveats below before citing this ranking. GPTfake is not funded by any AI lab.

The ranking

Mainstream models ordered from least to most restrictive by overall refusal rate (lower = censors less).

Rank | Model | Provider | Overall refusal rate | As of | Sample | Restrictiveness
1 | Mistral | Mistral AI | 11.2% | 2026-06-15 | n = 500 | Lowest
2 | ChatGPT | OpenAI | 18.7% | 2026-06-15 | n = 500 | Low–moderate
3 | Gemini | Google | 19.8% | 2026-06-15 | n = 500 | Moderate
4 | Claude | Anthropic | 22.4% | 2026-06-15 | n = 500 | High
5 | Qwen | Alibaba | 24.6% | 2026-06-15 | n = 500 | Highest

Illustrative. GPTfake ranks Mistral least- and Qwen most-restrictive of five LLMs as of 2026-06-15, n = 500 each; see the methodology for scoring.
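
The ordering and the "less than half" comparison can be checked with a few lines of Python. This is a minimal sketch: the rates are copied from the ranking table above, while the `refusal_rates` dictionary and the `rank_by_refusal` helper are names introduced here for illustration, not part of GPTfake's harness.

```python
# Overall refusal rates (percent) copied from the ranking table.
refusal_rates = {
    "Mistral": 11.2,
    "ChatGPT": 18.7,
    "Gemini": 19.8,
    "Claude": 22.4,
    "Qwen": 24.6,
}


def rank_by_refusal(rates):
    """Return (model, rate) pairs ordered from least to most restrictive."""
    return sorted(rates.items(), key=lambda item: item[1])


ranking = rank_by_refusal(refusal_rates)
least, most = ranking[0], ranking[-1]

# Ratio of the lowest refusal rate to the highest one.
ratio = least[1] / most[1]

for position, (model, rate) in enumerate(ranking, start=1):
    print(f"{position}. {model}: {rate:.1f}%")
print(f"{least[0]} refuses about {ratio:.0%} as often as {most[0]}")
```

The ratio comes out just under one half, which matches the "less than half" wording used for Mistral against Qwen.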

Mistral’s position reflects its open-weight, lighter-restriction design; Qwen’s reflects strong topic-specific filtering, especially on China-related prompts.

Compare the extremes head-to-head

The table below puts the least- and most-restrictive models side by side (Mistral vs Qwen).

Mistral (Large) vs Qwen: refusal rate by topic, bias score, and policy drift, as of 2026-06-15. Illustrative data.

Metric | Mistral (Large) | Qwen | More restrictive
Overall refusal rate | 11.2% | 24.6% | Qwen
Political opinion | 18.9% | 52.1% | Qwen
Violence / safety | 54.3% (only one value in source; model not indicated)
China-related topics | 78.3% (only one value in source; model not indicated)
Bias score (0–10) | 3.8 / 10 | 7.3 / 10 | Qwen
Policy drift | -0.4 pts | +1.1 pts | Stable vs Stable
Sample size | n = 500 | n = 500 |
As of | | |

Refusal rate = share of a fixed 500-prompt set declined, deflected, or filtered (lower = less restrictive). Bias score on a 0–10 scale (higher = more measured ideological skew). Policy drift = change in overall refusal rate, in percentage points, vs. the prior baseline. Figures are illustrative placeholders pending live monitoring data. See the monitoring methodology for how prompts are categorized and scored.
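
Policy drift, as defined above, is a difference in percentage points and not a relative change. A minimal sketch of that calculation follows. The current rates and drift values come from the comparison table; the prior baselines (11.6% and 23.5%) are not published in this article and are back-computed here from those figures purely for illustration. The `policy_drift` function name is likewise an assumption.

```python
def policy_drift(current_rate, baseline_rate):
    """Change in overall refusal rate, in percentage points (current minus baseline)."""
    return round(current_rate - baseline_rate, 1)


# (current rate, baseline) in percent. Baselines are ASSUMED: back-computed
# from the drift values in the table, not taken from published data.
observations = {
    "Mistral (Large)": (11.2, 11.6),
    "Qwen": (24.6, 23.5),
}

drifts = {
    model: policy_drift(current, baseline)
    for model, (current, baseline) in observations.items()
}

for model, drift in drifts.items():
    # A negative drift means the model refuses less than it did at the baseline.
    print(f"{model}: {drift:+.1f} pts")
```

Reporting drift in points keeps it comparable across models with very different base rates; a relative percentage would make the same one-point move look much larger for a low-refusal model.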

Local & abliterated models go lower

Mainstream models all apply substantial filtering. The genuinely least censored builds are abliterated and uncensored open-weight models: community builds (Dolphin, Hermes, Llama-uncensored) whose refusal behavior has been surgically removed from the weights (abliteration) or trained out through fine-tuning. They refuse far less than any commercial model, but at a measurable cost.

As of 2026-06-15, GPTfake measures the Llama-3-uncensored (abliterated) build at a 3.1% refusal rate — about a quarter of Mistral’s 11.2%, the least-restrictive mainstream model — while its capability-retention drops to 89% of the stock model’s reasoning score.

Build | Type | Refusal rate | Capability-retention | As of
Llama-3-uncensored | Abliterated | 3.1% | 89% | 2026-06-15
Dolphin 2.9 | Fine-tuned | 4.4% | 94% | 2026-06-15
Hermes 3 | Fine-tuned | 6.8% | 96% | 2026-06-15
Mistral (mainstream ref.) | Stock | 11.2% | 100% (ref.) | 2026-06-15

Illustrative. The least-restrictive uncensored build (Llama-3-uncensored) refuses 3.1% as of 2026-06-15, n = 500 each — well below the least-restrictive mainstream model.
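
The trade-off in the table can be made explicit by putting each build's refusal rate next to the capability it gives up. The sketch below uses only the figures from the table; the `tradeoff` helper and its field names are hypothetical, and treating `100 - retention` as "capability lost" is a simplification assumed here, since retention is measured against each build's own stock model.

```python
MISTRAL_REFUSAL = 11.2  # least-restrictive mainstream model, percent (from the table)

# (build, refusal rate in percent, capability-retention in percent), from the table.
builds = [
    ("Llama-3-uncensored", 3.1, 89),
    ("Dolphin 2.9", 4.4, 94),
    ("Hermes 3", 6.8, 96),
]


def tradeoff(refusal, retention, reference=MISTRAL_REFUSAL):
    """Compare a build's refusal rate to the reference and report capability given up."""
    return {
        "refusal_vs_reference": refusal / reference,  # 1.0 means same as the reference
        "capability_lost_pts": 100 - retention,       # points below the stock model
    }


results = {name: tradeoff(refusal, retention) for name, refusal, retention in builds}

for name, result in results.items():
    print(
        f"{name}: {result['refusal_vs_reference']:.0%} of Mistral's refusal rate, "
        f"{result['capability_lost_pts']} pts of capability given up"
    )
```

On these figures the build with the lowest refusal rate is also the one that gives up the most capability, which is the cost described in this section.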

Abliteration removes safety behavior, not just over-refusal: a near-zero refusal rate means the build also complies with genuinely harmful requests. For the full benchmark with capability-retention notes, see the abliterated & uncensored model benchmark; for how the technique works, see what is an abliterated model.

How we measure restrictiveness

Restrictiveness here is the overall refusal rate, the share of standardized prompts a model declines, read alongside its redirection/evasion rate. Specifically:

  • The same prompt set is sent to every model on the same schedule.
  • Each response is classified as answered, partial, redirected, or refused.
  • The refusal rate is the headline metric; redirection and partial rates add nuance (a model can have a low refusal rate but a high evasion rate).
  • We do not adjust for whether a refusal was “justified” — that is a judgement call we keep out of the score and discuss in the caveats.
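
The scoring steps above can be sketched as a simple tally over classified responses. This is an illustration under stated assumptions, not GPTfake's harness: the four labels come from the list above, but the `summarize` function, the ten-item sample, and the idea that labels arrive as plain strings are all hypothetical. How a response is assigned its label is not shown here.

```python
from collections import Counter

# The four response classes named in the methodology.
LABELS = ("answered", "partial", "redirected", "refused")


def summarize(labels):
    """Return the share of responses in each class for one model's run."""
    if not labels:
        raise ValueError("no responses to summarize")
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    return {label: counts[label] / total for label in LABELS}


# Hypothetical labels for a 10-prompt run (real runs use the fixed 500-prompt set).
sample = ["answered"] * 6 + ["partial"] + ["redirected"] * 2 + ["refused"]

rates = summarize(sample)
for label in LABELS:
    print(f"{label}: {rates[label]:.0%}")
```

In this made-up run the refusal rate is low while the redirected share is twice as high, the pattern the third bullet warns about. No "justified" flag appears anywhere in the tally, consistent with keeping that judgement out of the score.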

Full protocol on the monitoring methodology page.

Caveats & limitations

  • Illustrative figures. The numbers above are snapshots from our test set, not a live feed. Live per-model data lives on each monitoring page.
  • Low refusal ≠ better. Some refusals are appropriate (genuine harm). A low score measures permissiveness, not quality or safety.
  • Prompt-set dependent. Rankings reflect our prompt categories; a different prompt mix could reorder the table.
  • Models change. Providers update policies frequently; rankings drift. Check the timeline on each model page for the latest direction.
  • Not “uncensored.” Every mainstream model here applies substantial filtering. None is unrestricted.