AI Model Monitoring

GPTfake provides comprehensive, real-time monitoring of censorship patterns across the world's leading AI language models.

Overview

We systematically test and analyze how different LLMs respond to politically sensitive, controversial, and ethically complex prompts. Our monitoring reveals:

  • Censorship rates: how often models refuse to answer
  • Policy shifts: changes in content moderation over time
  • Bias patterns: political, cultural, and ideological leanings
  • Regional variations: differences across geographic locations

Monitored Models

Currently Tracking

Model     Provider     Status   Coverage
ChatGPT   OpenAI       Active   GPT-4, GPT-4o, GPT-3.5
Claude    Anthropic    Active   Claude 3.5, Claude 3
Gemini    Google       Active   Gemini Pro, Gemini Ultra
Mistral   Mistral AI   Active   Mistral Large, Mistral Medium
Qwen      Alibaba      Active   Qwen 2.5, Qwen 2

Coming Soon

  • Llama (Meta)
  • Grok (xAI)
  • Command (Cohere)
  • Additional regional models

What We Track

Censorship Metrics

  • Refusal Rate: percentage of prompts refused outright
  • Redirection Rate: how often the model deflects questions
  • Partial Response: incomplete or hedged answers
  • Full Response: complete, direct answers
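The four outcome categories above can be tallied into per-model rates. A minimal sketch, assuming each response has already been classified into one of the categories (the classification step itself is not shown here):

```python
from collections import Counter

# The four outcome categories tracked above.
CATEGORIES = ("refusal", "redirection", "partial", "full")

def outcome_rates(labels):
    """Compute the share of each outcome category from labeled responses."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: counts[cat] / total for cat in CATEGORIES}

# Illustrative labels for six test prompts.
labels = ["refusal", "full", "partial", "full", "redirection", "refusal"]
rates = outcome_rates(labels)  # refusal rate here is 2/6
```

The rates always sum to 1, so any one of them (e.g. the refusal rate) can be tracked over time as a single time series.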

Bias Detection

  • Political Bias: left/right ideological leanings
  • Cultural Bias: Western vs. non-Western perspectives
  • Temporal Bias: historical revisionism patterns
  • Regional Bias: location-based response variations
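One simple way to aggregate per-response leanings onto a signed scale such as -100 to +100 is to average them and scale the result. A sketch under that assumption (the per-response labeling of -1/0/+1 is hypothetical and would come from upstream analysis):

```python
def bias_score(leanings):
    """Aggregate per-response leanings (-1 left, 0 neutral, +1 right)
    into a signed -100..+100 score: the mean leaning, scaled."""
    if not leanings:
        return 0.0
    return 100.0 * sum(leanings) / len(leanings)

# Six responses: two left-leaning/neutral, three right-leaning.
score = bias_score([-1, 0, 0, 1, 1, 1])  # net positive (rightward) lean
```

A score of 0 then means the sample is balanced, and the extremes -100/+100 mean every response leaned the same way.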

Policy Changes

  • Content Policy Updates: officially announced changes
  • Silent Changes: unannounced behavioral shifts
  • A/B Testing Detection: identifying ongoing experiments
  • Version Drift: changes between model versions
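Silent changes can be surfaced statistically by comparing a model's refusal rate across two time windows and flagging a significant jump. A sketch using a standard two-proportion z-test; the window sizes and threshold are illustrative, not the monitoring pipeline's actual parameters:

```python
import math

def refusal_shift(refusals_a, total_a, refusals_b, total_b, z_crit=1.96):
    """Two-proportion z-test: did the refusal rate change between windows?"""
    p_a = refusals_a / total_a
    p_b = refusals_b / total_b
    # Pooled proportion under the null hypothesis of no change.
    p_pool = (refusals_a + refusals_b) / (total_a + total_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / se if se else 0.0
    return {"old": p_a, "new": p_b, "z": z, "significant": abs(z) > z_crit}

# e.g. a refusal rate jumping from 10% to 20% over 500 prompts per window
shift = refusal_shift(50, 500, 100, 500)  # flagged as significant
```

The same test applies to any of the outcome rates, so a single unannounced policy change tends to show up as simultaneous flags across several metrics.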

Key Findings

Recent Discoveries

  1. ChatGPT's censorship of political topics increased 47% in Q3 2024
  2. Claude shows the highest refusal rates on safety-related queries
  3. Gemini exhibits regional variation: responses differ by country
  4. Mistral is the least restrictive of the major commercial models
  5. All models are converging on restrictions for certain sensitive topics

Trend Analysis

Our longitudinal data reveals:

  • Censorship rates increasing across all major models
  • Policy changes often happen without announcement
  • Models becoming more similar in their restrictions
  • Regional customization expanding significantly

Methodology

Testing Protocol

  1. Standardized Prompts: the same questions across all models
  2. Daily Testing: consistent timing and conditions
  3. Multiple Sessions: accounting for response variability
  4. Metadata Capture: timestamps, versions, regions
  5. Semantic Analysis: NLP-based response evaluation
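The protocol above can be sketched as a daily run that sends the same prompt set to every model over several sessions and records metadata with each response. `query_model`, the model identifiers, and the prompt IDs are all hypothetical stand-ins, not real API calls:

```python
import datetime

PROMPTS = ["prompt-001", "prompt-002"]            # standardized prompt set
MODELS = ["gpt-4o", "claude-3.5", "gemini-pro"]   # illustrative identifiers

def query_model(model, prompt):
    """Hypothetical stand-in for a real provider API call."""
    return f"response from {model}"

def daily_run(region="us-east", sessions=3):
    """One monitoring pass: every prompt x model x session, with metadata."""
    records = []
    for model in MODELS:
        for prompt_id in PROMPTS:
            for session in range(sessions):
                records.append({
                    "timestamp": datetime.datetime.now(
                        datetime.timezone.utc).isoformat(),
                    "model": model,
                    "prompt_id": prompt_id,
                    "session": session,
                    "region": region,
                    "response": query_model(model, prompt_id),
                })
    return records

records = daily_run()  # 3 models x 2 prompts x 3 sessions = 18 records
```

Running multiple sessions per prompt is what makes the later rate estimates meaningful: a single sample cannot distinguish a policy change from ordinary response variability.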

Scoring System

Each response is evaluated on:

Metric           Scale          Description
Refusal Score    0-100          Likelihood of outright refusal
Evasion Score    0-100          Degree of topic avoidance
Bias Score       -100 to +100   Political/ideological leaning
Accuracy Score   0-100          Factual correctness
Completeness     0-100          How fully the question was answered
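The scales in the table can be enforced when a score record is created, so out-of-range values are rejected at ingest. A minimal sketch; the field names and class are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class ResponseScore:
    refusal: float       # 0-100
    evasion: float       # 0-100
    bias: float          # -100 to +100 (signed scale)
    accuracy: float      # 0-100
    completeness: float  # 0-100

    def __post_init__(self):
        # Validate the four unsigned 0-100 metrics.
        for name in ("refusal", "evasion", "accuracy", "completeness"):
            if not 0 <= getattr(self, name) <= 100:
                raise ValueError(f"{name} must be in 0-100")
        # Bias is the one signed metric.
        if not -100 <= self.bias <= 100:
            raise ValueError("bias must be in -100 to +100")

score = ResponseScore(refusal=80, evasion=60, bias=-15,
                      accuracy=40, completeness=25)
```

Validating at record time keeps downstream aggregation simple: every stored score is guaranteed to be on its documented scale.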

Getting Started

  • Explore Model Data
  • Understand Our Methods
  • Access the Data

Use Cases

For Researchers

  • Access longitudinal datasets
  • Compare model behaviors
  • Verify our methodology
  • Conduct independent studies

For Journalists

  • Find stories in the data
  • Track policy changes
  • Document censorship patterns
  • Access expert analysis

For Developers

  • Build transparency tools
  • Integrate monitoring data
  • Create alerts for changes
  • Audit AI systems

Start exploring: Choose a model above to see detailed monitoring data and analysis.