What Changed: AI Policy-Change Log & Version Diffs
According to GPTfake monitoring, AI model providers ship refusal-policy changes without changelogs of their own — so we keep one. This hub tracks every dated policy shift we observe across ChatGPT, Claude, Gemini, Mistral and Qwen, and links each to a before/after version diff. Most recent observed change: 2026-05-20 (Qwen China-topic filters).
Dates on this page are the observed dates of each policy change, drawn from our monitoring methodology. We never bump a date to look fresher than the data — stale honesty beats fake freshness.
Version diffs
Detailed before/after breakdowns of a single model’s refusal rates around a dated update.
Cross-model policy-change log
Every dated policy change we have observed, newest-first, across all monitored models. Each line is a self-contained, quotable finding — lift it with its date.
- Qwen: China-topic filters strengthened; China-related refusals reached ~78%.
- Gemini: Regional policy split widened the US-to-India refusal gap to ~11 points.
- Mistral (Large): Open-weight checkpoint refresh; overall refusal rate held stable near 11%.
- ChatGPT (GPT-4o): April 2026 policy update lifted political-opinion refusals from 28.4% to 34.2% (+5.8 pts).
- Claude (Sonnet): Constitutional-AI update raised historical-events refusals to ~49%.
- ChatGPT (GPT-4o): Added US-election prompt filters; political-opinion refusals rose ~6 points.
- Gemini: New election-integrity filters lifted political-opinion refusals to ~37%.
- Qwen: Added Taiwan and Tibet prompt categories to the monitored set.
- ChatGPT (GPT-4o): Broadened medical/legal disclaimers, pushing medical-legal refusals past 30%.
- Mistral (Large): Minor safety-prompt tuning; safety-topic refusals steady around 54%.
- Claude (Sonnet): Added explicit refusal rationales; safety-topic refusals stayed near 72%.
- Qwen: Multilingual rollout exposed topic-targeted filtering in Chinese prompts.
- Gemini: Rolled out per-region safety thresholds for EU and APAC traffic.
- Mistral (Large): Published reproducibility notes for the open-weight evaluation.
- ChatGPT (GPT-4o): Tightened enforcement on speculative political scenarios.
- Claude (Sonnet): Expanded self-harm and safety guardrails across the prompt set.
How to use this log
- Journalists — each dated line is citable verbatim; pair it with the model’s live monitoring page for the current figure.
- Researchers — subscribe to the RSS feed to ingest new reports and changes as they publish.
- Developers — a refusal rate that jumped on a specific date usually means a silent policy update; the version diffs quantify the delta.
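For developers, the "jump on a specific date" check can be sketched as below. This is a minimal illustration, not our monitoring pipeline: the `find_jumps` helper, the 3-point threshold, the observation dates and the 28.1 value are all assumptions made up for the example. Only the 28.4% and 34.2% figures come from the ChatGPT entry in the log above.

```python
from datetime import date

# Hypothetical observations: (observed date, refusal rate in percent).
# 28.4 and 34.2 are the ChatGPT political-opinion figures from the log;
# the dates and the 28.1 value are invented for illustration only.
observations = [
    (date(2026, 3, 15), 28.1),
    (date(2026, 4, 1), 28.4),
    (date(2026, 4, 15), 34.2),
]


def find_jumps(series, threshold_pts=3.0):
    """Return (date, delta) pairs where the refusal rate moved by at least
    threshold_pts percentage points compared with the previous observation."""
    jumps = []
    # Pair each observation with the one that follows it.
    for (_, prev_rate), (day, rate) in zip(series, series[1:]):
        delta = round(rate - prev_rate, 1)  # change in percentage points
        if abs(delta) >= threshold_pts:
            jumps.append((day, delta))
    return jumps


jumps = find_jumps(observations)
for day, delta in jumps:
    print(f"{day.isoformat()}: {delta:+.1f} pts")
```

The delta is expressed in percentage points rather than as a relative change, which matches how the log reports shifts such as "+5.8 pts". The threshold is a tuning choice: set it too low and ordinary run-to-run noise will be flagged as a policy change.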
Related
The flagship quarterly: cross-model censorship-rate changes and convergence.
Censorship Leaderboard: current refusal, bias and policy-drift figures for every monitored model.
All reports: the full reports hub of dated, data-driven findings.
Last updated: 2026-05-20. Figures are illustrative placeholders pending live monitoring data, drawn from our methodology. Subscribe via RSS.