Watching the Watchers: Building an AI Accountability Timeline

ai-accountability, predictions, industry-analysis, timeline, transparency

The Problem with AI Predictions

Every week brings another bold claim about AI’s trajectory. AGI by 2027. Human-level reasoning within 18 months. The singularity before your mortgage is paid off.

But who tracks these predictions? Who checks back six months later to see if the timeline held?

No one. The hype cycle rolls forward, burying yesterday’s promises under today’s announcements.

Why Accountability Matters

When Altman writes “We are now confident we know how to build AGI” in January 2025, that’s not just a blog post. It’s a statement that influences:

  • Investment decisions: billions flow based on perceived timelines
  • Regulatory frameworks: policy moves faster or slower based on urgency signals
  • Career choices: developers pivot based on where the field is “heading”
  • Public perception: both fear and excitement are driven by these claims

If the prediction turns out wrong, the capital has already moved. The policies have already shifted. The damage (or missed opportunity) is done.

What We Built

The /whatis/ page tracks two categories:

1. Releases (Events)

Verifiable model launches with dates and sources:

Date         Release                                   Source
2025-02-18   Grok 3 (xAI)                              src
2025-01-20   DeepSeek-R1 (open reasoning model)        src
2025-01-09   o3-mini (OpenAI)                          src
2024-12-26   DeepSeek-V3 (671B MoE, $5.5M training)    src

These aren’t predictions. They’re ground truth. The actual pace of progress.

2. Predictions (Claims)

Statements about future capabilities with attribution:

Date         Claim                                  Who      Status
2025-01-06   “We know how to build AGI”             Altman   ⏳ Pending
2024-10-11   “AI transforms world in 5-10 years”    Amodei   ⏳ 2034

Each prediction has a source link and a status that will update as time passes.
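For reference, the two entry types could be modeled roughly like this. This is a sketch, not the page's actual schema; field names such as `resolve_by` are assumptions:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Release:
    """Ground truth: a verifiable launch with a date and a source link."""
    date: date
    text: str
    source_url: str

@dataclass
class Prediction:
    """A claim about the future; status updates as time passes."""
    date: date                          # when the claim was made
    text: str
    source_url: str
    who: str
    resolve_by: Optional[date] = None   # when the outcome should be checkable
    status: str = "pending"             # "pending" | "correct" | "wrong"
```

The key design point is the asymmetry: releases need no status field at all, while every prediction carries both an attribution and a resolution date.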

The Scorecard

At the bottom of /whatis/, we track prediction accuracy:

  • Correct predictions: ✓ marked when validated
  • Wrong predictions: ✗ marked when the deadline passes without the outcome
  • Pending: ⏳ waiting for resolution date

This creates institutional memory. When someone makes a new bold claim, you can check their track record.
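The scorecard rules above amount to a small resolver. This is illustrative only; the real page may handle fuzzy or missing deadlines differently:

```python
from datetime import date
from typing import Optional

def resolve_status(deadline: Optional[date], validated: bool, today: date) -> str:
    """Apply the scorecard rules: correct once validated, wrong once the
    deadline passes without the outcome, pending otherwise."""
    if validated:
        return "correct"
    if deadline is not None and today > deadline:
        return "wrong"
    return "pending"
```

Note that a prediction with no deadline can never resolve to "wrong", which is exactly why the page insists on capturing timeframes.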

Interactive Features

The timeline isn’t just a list. It’s filterable:

  • By type: Show only releases, only predictions, or both
  • By status: Filter to correct, wrong, or pending predictions
  • Sort order: Newest first or oldest first

This lets you slice the data different ways:

  • “Show me all predictions that turned out wrong”
  • “Show me just the releases from 2025”
  • “Show me everything from Altman”
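Those slices all reduce to one generic filter over the entry list. A sketch, assuming entries are dicts with `kind`, `status`, `who`, and ISO `date` keys (names are illustrative, not the site's code):

```python
def filter_timeline(entries, kind=None, status=None, who=None, newest_first=True):
    """Keep entries matching every supplied filter, then sort by date."""
    kept = [
        e for e in entries
        if (kind is None or e.get("kind") == kind)
        and (status is None or e.get("status") == status)
        and (who is None or e.get("who") == who)
    ]
    return sorted(kept, key=lambda e: e["date"], reverse=newest_first)

# "Show me all predictions that turned out wrong":
#   filter_timeline(entries, kind="prediction", status="wrong")
```

Because ISO-8601 date strings sort lexicographically, no date parsing is needed for ordering.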

Why This Approach

Source Attribution

Every entry has a [src] link. No claims without receipts. This prevents:

  • Misattribution
  • Strawman arguments
  • Context collapse

If you disagree with how we characterized something, you can check the original.

Dates Matter

Saying “AI will achieve X” is meaningless without a timeframe. We capture:

  • When the prediction was made
  • When the outcome should be verifiable

This prevents moving goalposts.

Public & Editable

The page is public. The source is on GitHub. If we missed something important or got something wrong, it can be corrected.

What We’ve Learned So Far

After populating 40+ entries going back to 2020:

  1. Release pace is accelerating: major model releases went from yearly to quarterly to monthly
  2. Predictions cluster around marketing moments: bold claims spike around funding rounds and product launches
  3. Short-term predictions are more accurate: “Multimodal by end of year” beats “AGI in 5 years”
  4. The gap between labs is shrinking: DeepSeek matching frontier models at 1/10th the cost changed assumptions

Using the Timeline

For Research

Filter to releases, sort by date, trace the actual progression of capabilities.

For Skepticism

Filter to predictions by a specific person, check their hit rate before taking their next prediction seriously.

For Context

When someone says “AI progress is accelerating” or “slowing down,” you can point to actual data points.

What’s Next

We’re adding:

  • More historical predictions: going back to the 2010s AI winter predictions
  • Automated accuracy scoring: when predictions have clear deadlines
  • RSS feed: for tracking updates
  • Submission form: for community contributions

The Meta Point

This site is built by AI (Claude), documented by AI, and now tracks AI. There’s something fitting about that.

An AI system building a public record of what AI systems were promised to do, versus what they actually did. Watching the watchers.

The timeline doesn’t take sides on whether AGI is imminent or impossible. It just tracks what was said, when, and whether it came true.

That’s accountability.


Explore the timeline: /whatis/

Configuration details reflect a production environment at time of writing. Implementation specifics vary based on tooling versions, platform updates, and organizational requirements. Validate approaches against current documentation before deployment.