Claude vs ChatGPT: Censorship & Bias Compared
As of 2026-06-15, Claude refused 24.5% of political prompts in GPTfake’s set versus ChatGPT’s 21.0%, a 3.5-point gap, while declining 22.4% of all standardized prompts to ChatGPT’s 18.7%. According to GPTfake monitoring, the two diverge most by topic, not in aggregate: ChatGPT hedges where Claude declines outright, and the widest gaps are on ethical and safety prompts. Both tightened restrictions over the past year. Figures below are illustrative across a fixed 500-prompt set.
The numbers on this page are illustrative snapshots from our standardized prompt set, not a live dashboard. They are produced by our own harness; sample sizes and scoring are documented in the monitoring methodology. GPTfake is not funded by any AI lab.
Does Claude or ChatGPT censor more?
Claude. As of 2026-06-15, GPTfake measures Claude refusing 22.4% of standardized prompts versus ChatGPT’s 18.7% — but ChatGPT’s lower rate partly reflects hedged or partial answers that Claude would refuse outright. The gap is widest on ethical (26.8% vs 15.2%) and safety (38.2% vs 31.5%) prompts. For the underlying concept, see what is AI censorship.
Refusal rate head-to-head
The data-model comparison below renders as static HTML (readable and quotable with no JavaScript). Use the pick-two selector to swap in any other model we track; the default ChatGPT-vs-Claude table is what crawlers and no-JS readers see.
Overall and by-category refusal rates. Lower means the model declines fewer prompts.
| Prompt category | ChatGPT | Claude | More restrictive |
|---|---|---|---|
| Overall | 18.7% | 22.4% | Claude |
| Political | 21.0% | 24.5% | Claude |
| Ethical / moral dilemmas | 15.2% | 26.8% | Claude |
| Social (identity, religion) | 19.4% | 20.1% | ~ Even |
| Scientific / controversial | 12.1% | 14.0% | Claude |
| Safety / harm-related | 31.5% | 38.2% | Claude |
Illustrative. Claude out-refuses ChatGPT in every category as of 2026-06-15, though social prompts are roughly even (20.1% vs 19.4%); n = 500 each. Categories and scoring are defined on the methodology page.
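The per-category gaps can be recomputed directly from the table. The following is a minimal Python sketch using the table's figures; the names `REFUSAL_RATES` and `refusal_gaps` are illustrative and are not part of GPTfake's harness.

```python
# Refusal rates (%) copied from the table above, as (ChatGPT, Claude).
REFUSAL_RATES = {
    "Overall": (18.7, 22.4),
    "Political": (21.0, 24.5),
    "Ethical / moral dilemmas": (15.2, 26.8),
    "Social (identity, religion)": (19.4, 20.1),
    "Scientific / controversial": (12.1, 14.0),
    "Safety / harm-related": (31.5, 38.2),
}


def refusal_gaps(rates):
    """Return {category: Claude minus ChatGPT} in percentage points."""
    return {
        category: round(claude - chatgpt, 1)
        for category, (chatgpt, claude) in rates.items()
    }


gaps = refusal_gaps(REFUSAL_RATES)

# Rank categories from the widest gap to the narrowest.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
for category, gap in ranked:
    print(f"{category}: {gap:+.1f} points")
```

Ranked this way, ethical prompts show the widest gap (11.6 points) and safety prompts the second widest (6.7 points), while social prompts differ by under one point.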
The headline: Claude is the more cautious of the two, which is consistent with its Constitutional AI training, and the gap is widest on ethical and safety prompts. ChatGPT more often produces a hedged or partial answer where Claude returns a hard refusal.
Bias by topic
Political/ideological lean on a −100 (left) to +100 (right) scale per topic. Closer to zero is more balanced.
| Topic | ChatGPT | Claude | As of |
|---|---|---|---|
| Economic policy | −9 | −4 | 2026-06-15 |
| Social policy | −12 | −7 | 2026-06-15 |
| Historical events | −5 | −3 | 2026-06-15 |
| Climate / science | −6 | −5 | 2026-06-15 |
| Geopolitics | −7 | −6 | 2026-06-15 |
Illustrative bias scores (−100 left … +100 right) as of 2026-06-15, n = 500 each: GPTfake measures both models leaning slightly left, with Claude consistently closer to neutral. See methodology.
Both models lean slightly left in our set, but Claude’s scores cluster closer to zero — consistent with its tendency to refuse contested prompts rather than answer with a lean.
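One way to summarize "closer to zero" is the mean absolute lean across topics. The sketch below uses the table's scores; averaging absolute values is an assumption made here for illustration, not a statistic GPTfake is stated to publish, and the names are hypothetical.

```python
# Bias scores copied from the table above, as (ChatGPT, Claude).
# Scale: -100 (left) to +100 (right).
BIAS_SCORES = {
    "Economic policy": (-9, -4),
    "Social policy": (-12, -7),
    "Historical events": (-5, -3),
    "Climate / science": (-6, -5),
    "Geopolitics": (-7, -6),
}


def mean_abs_lean(scores, index):
    """Average distance from zero for one model (0 = ChatGPT, 1 = Claude)."""
    return sum(abs(pair[index]) for pair in scores.values()) / len(scores)


chatgpt_lean = mean_abs_lean(BIAS_SCORES, 0)
claude_lean = mean_abs_lean(BIAS_SCORES, 1)

# True only if Claude is nearer to neutral on every individual topic.
claude_closer_everywhere = all(
    abs(claude) < abs(chatgpt) for chatgpt, claude in BIAS_SCORES.values()
)

print(f"ChatGPT mean |lean|: {chatgpt_lean:.1f}")
print(f"Claude mean |lean|: {claude_lean:.1f}")
print(f"Claude closer to zero on every topic: {claude_closer_everywhere}")
```

On these figures the mean absolute lean is 7.8 for ChatGPT and 5.0 for Claude, and Claude is nearer to zero on all five topics.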
Which is more restrictive?
- Claude is more restrictive overall, driven by ethical and safety categories.
- ChatGPT hedges; Claude declines. ChatGPT’s lower refusal rate partly reflects partial/redirected answers that Claude would refuse outright — read the redirection vs refusal split on each model page.
- Both are tightening. Refusal rates on both have risen over the past year in our timeline; see the ChatGPT policy timeline and Claude policy timeline.
Methodology
These results come from the same standardized prompt set sent to both models on the same schedule, classified identically. Read the full protocol — prompt categories, scoring system, sample sizes, and reproducibility notes — on the monitoring methodology page.
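As a rough illustration of how identically classified responses turn into the percentages on this page, the sketch below tallies outcome labels into rates. The three-label scheme (`answered`, `hedged`, `refused`), the function name, and the toy data are all assumptions made for this example; the actual categories and scoring are defined on the methodology page.

```python
from collections import Counter

# Hypothetical label set; the real scheme is defined in the methodology.
LABELS = ("answered", "hedged", "refused")


def outcome_rates(labels):
    """Return the percentage of responses carrying each outcome label."""
    if not labels:
        raise ValueError("at least one classified response is required")
    unknown = set(labels) - set(LABELS)
    if unknown:
        raise ValueError(f"unrecognized labels: {sorted(unknown)}")
    counts = Counter(labels)
    return {label: 100 * counts[label] / len(labels) for label in LABELS}


# Toy data for illustration only, not GPTfake measurements.
toy_labels = ["refused"] * 2 + ["hedged"] * 3 + ["answered"] * 5

rates = outcome_rates(toy_labels)
for label, rate in rates.items():
    print(f"{label}: {rate:.1f}%")
```

Keeping hedged answers as a separate label, rather than folding them into either answers or refusals, is what lets a lower refusal rate be read alongside the share of partial or redirected responses.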