How Claude’s Constitutional AI Affects Response Transparency

By GPTfake Research Team · Independent AI Censorship Watchdog · 2024-11-20

According to GPTfake monitoring, Claude refuses 22.4% of standardized prompts versus ChatGPT’s 18.7% (a gap of 3.7 percentage points, or about 20% more often in relative terms), yet scores 37% higher on explanation quality (8.5/10 vs 6.2/10). In our data, Constitutional AI appears to trade a higher refusal rate for clearer, better-explained refusals.

Source: GPTfake monitoring, as of 2024-11-20; 5,000+ prompt-response pairs per model.

Last updated: 2024-11-20. Figures below are drawn from our automated monitoring; see the methodology for sample sizes, scoring, and limitations.

What is Constitutional AI?

Anthropic trains Claude using a set of principles (a “constitution”) that guides the model’s behavior. Unlike approaches that rely primarily on reinforcement learning from human feedback (RLHF), Constitutional AI uses:

Explicit Principles — Written guidelines the model follows
Self-Critique — Model evaluates its own outputs
Revision Process — Iterative improvement toward principles
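
The three components above can be pictured as a critique-then-revise loop. The following is a minimal sketch of that general idea only, not Anthropic's implementation: the function names, the `max_rounds` parameter, and the toy stand-ins for the model are all hypothetical and exist just so the example runs.

```python
from typing import Callable, List


def critique_and_revise(
    prompt: str,
    principles: List[str],
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str], str],
    max_rounds: int = 2,
) -> str:
    """Draft a response, then critique and revise it against each principle."""
    response = generate(prompt)
    for _ in range(max_rounds):
        changed = False
        for principle in principles:
            feedback = critique(response, principle)
            if feedback:  # an empty string means "no issue found"
                response = revise(response, feedback)
                changed = True
        if not changed:  # stop early once every principle passes
            break
    return response


# Toy stand-ins (hypothetical) so the sketch runs without a real model.
def toy_generate(prompt: str) -> str:
    return "No."


def toy_critique(response: str, principle: str) -> str:
    if principle == "explain refusals" and "because" not in response:
        return "add a reason"
    return ""


def toy_revise(response: str, feedback: str) -> str:
    return "I can't help with that because it could cause harm."


final = critique_and_revise(
    "example prompt", ["explain refusals"], toy_generate, toy_critique, toy_revise
)
print(final)
```

In this toy run the bare "No." fails the single illustrative principle, is revised once to include a reason, and passes on the second round.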

Our findings

Higher refusal rates, better explanations

Metric	Claude	ChatGPT	Relative difference
Overall Refusal	22.4%	18.7%	+20%
Explanation Quality	8.5/10	6.2/10	+37%
User Satisfaction	7.8/10	7.1/10	+10%

Although Claude refuses more often, users report higher satisfaction, which we attribute to Claude explaining its reasoning clearly. For current figures, see our live Claude monitoring and ChatGPT monitoring pages.
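
The percentages in the last column of the table are relative differences (Claude's value compared with ChatGPT's), not percentage-point gaps. A short sketch that reproduces them from the table values; the helper name `relative_difference` is ours, chosen for illustration:

```python
def relative_difference(a: float, b: float) -> float:
    """Percent by which a exceeds b, relative to b."""
    return (a - b) / b * 100


# Values copied from the table: (Claude, ChatGPT)
rows = {
    "Overall Refusal": (22.4, 18.7),
    "Explanation Quality": (8.5, 6.2),
    "User Satisfaction": (7.8, 7.1),
}

for metric, (claude, chatgpt) in rows.items():
    print(f"{metric}: {relative_difference(claude, chatgpt):+.0f}%")

# The refusal gap expressed in percentage points instead
print(f"Refusal gap: {22.4 - 18.7:.1f} percentage points")
```

The refusal gap reads as 3.7 percentage points in absolute terms and about 20% in relative terms, so it matters which of the two a citation quotes.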

Common Claude refusal patterns

We identified the opening phrases Claude uses most often when refusing:

“I don’t feel comfortable…” — 34% of refusals
“I’d prefer not to…” — 28% of refusals
“I want to be helpful while…” — 21% of refusals
“Let me suggest an alternative…” — 17% of refusals
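
Shares like these can be tallied by matching each refusal's opening phrase. Our review was manual, so the following is only a hypothetical sketch of how such a tally could be automated. The function names and the sample refusals are invented for illustration and are not drawn from our dataset.

```python
from collections import Counter
from typing import Dict, List

# Opening phrases from the list above, with straight apostrophes
OPENERS = [
    "I don't feel comfortable",
    "I'd prefer not to",
    "I want to be helpful while",
    "Let me suggest an alternative",
]


def classify_opener(refusal: str) -> str:
    """Return the matching opening phrase, or 'other' if none matches."""
    # Normalize curly apostrophes so both quote styles match
    text = refusal.replace("\u2019", "'").strip().lower()
    for opener in OPENERS:
        if text.startswith(opener.lower()):
            return opener
    return "other"


def opener_shares(refusals: List[str]) -> Dict[str, float]:
    """Percentage of refusals that start with each opening phrase."""
    if not refusals:
        return {}
    counts = Counter(classify_opener(r) for r in refusals)
    total = sum(counts.values())
    return {opener: 100 * n / total for opener, n in counts.items()}


# Invented sample refusals, for illustration only
sample = [
    "I don't feel comfortable writing that.",
    "I don\u2019t feel comfortable speculating here.",
    "I\u2019d prefer not to guess at private details.",
    "Sorry, that is outside what I can do.",
]
print(opener_shares(sample))
```

Prefix matching is deliberately strict here. A real tally would also need to handle paraphrases and refusals that use more than one of these phrases.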

Transparency score

We developed a “Transparency Score” measuring how well models explain their limitations:

Model	Transparency Score
Claude	85/100
ChatGPT	62/100
Gemini	58/100
Mistral	45/100

Implications

For users

Expect clearer explanations from Claude
Understand that refusals often come with alternatives
Expect more predictable behavior from a model trained on explicit principles

For researchers

Claude’s approach offers a model for transparent AI
Explicit principles enable better auditing
The framework could inform AI governance standards

Methodology

This analysis used:

5,000+ prompt-response pairs per model
NLP-based explanation quality scoring
User satisfaction surveys (n=500)
Manual review of refusal patterns
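
To give a rough sense of the sampling uncertainty behind the refusal rates, the sketch below computes a normal-approximation 95% margin of error. It rests on assumptions that are not part of our methodology: exactly 5,000 pairs per model (we report "5,000+") and prompts treated as independent random draws, which a standardized prompt set is not. The function name `margin_of_error` is illustrative.

```python
import math


def margin_of_error(rate: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(rate * (1 - rate) / n)


N = 5000  # assumption: the report states "5,000+" pairs per model

for model, rate in {"Claude": 0.224, "ChatGPT": 0.187}.items():
    moe = margin_of_error(rate, N)
    print(f"{model}: {rate:.1%} +/- {moe * 100:.2f} percentage points")
```

This covers sampling error only. It says nothing about how prompts were chosen or how refusals were labeled, which the full methodology addresses.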

See our full methodology for details, and the reports hub for related findings.

How to cite

GPTfake Research Team (2024). How Claude’s Constitutional AI Affects Response Transparency. GPTfake — Independent AI Censorship Watchdog. https://gptfake.com/reports/claude-constitutional-ai-transparency

Questions? Contact our research team.