L0 Chapter 2 🥚 🕒 15 min

AI History in 15 Minutes: From Turing to ChatGPT

70 years compressed into 11 moments. Not a timeline — only the events that changed the game.

HelloAI Editors

5/28/2026

If you just want to know how ChatGPT appeared out of nowhere—this one article is enough.

Not a chronological account. Only the 11 moments that actually changed the game. After reading you’ll see: today’s AI wasn’t invented by one person in one year. It’s a 70-year relay across multiple generations.

1950 · Turing asked a different question

Five years after WWII, British mathematician Alan Turing wrote a paper: “Computing Machinery and Intelligence.”

He didn’t ask whether machines can think—he thought that was too philosophical, with no clear definition. He reframed it:

If a human and a machine chat through a curtain, and the human can’t tell which is which—then the machine “thinks”, doesn’t it?

This is the Turing Test. Its brilliance: it turns abstract “intelligence” into something experimentally testable.

🔬 Trivia

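Turing predicted in this paper that by about 2000, an average interrogator would have no more than a 70% chance of correctly telling machine from human after five minutes of questioning. Arguably the prediction only came true some 70 years after he made it, with ChatGPT.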

1956 · The field gets a name: AI

Six years later, a summer workshop was held at Dartmouth College. The organizers included John McCarthy and Marvin Minsky, both still in their twenties, and Claude Shannon (father of information theory), who was 40.

In their proposal they coined the term “Artificial Intelligence”—mainly to distinguish it from the then-popular “Cybernetics.”

AI is born as a field. But everyone was wildly optimistic—they thought “in 10 years we’ll have machines as smart as humans.”

1969 · The first AI Winter

In the 1960s, AI research went two directions:

Symbolic reasoning (if-then rules)
Simple neural networks (“perceptrons”, proposed in 1958)

In 1969, Marvin Minsky (yes, again) co-wrote Perceptrons, mathematically proving: single-layer perceptrons can’t even learn the XOR function.

This book killed US Department of Defense funding for neural networks. For 15 years, neural net research was a wasteland. The first AI Winter.

⚠️ The historical irony

Minsky’s proof was correct: single layers really can’t do XOR. Multi-layer networks can, but at the time nobody had a practical way to train them. It took almost 20 years for that missing piece to catch on.
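The XOR limit is easy to check for yourself. The sketch below is only an illustration and is not taken from Perceptrons: the function names, the 100-epoch cap, and the hand-picked weights in two_layer_xor are all assumptions made for this example. It trains a single threshold unit with the classic perceptron rule on AND (which a single layer can learn) and on XOR (which it cannot), then wires up a two-layer network by hand that does compute XOR.

```python
from itertools import product

def step(z):
    """Threshold activation: fire (1) only if the weighted sum is positive."""
    return 1 if z > 0 else 0

def train_perceptron(samples, epochs=100, lr=1):
    """Perceptron learning rule for one threshold unit with two inputs."""
    w1 = w2 = bias = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            delta = target - step(w1 * x1 + w2 * x2 + bias)
            if delta != 0:
                # Nudge the weights toward the correct answer.
                w1 += lr * delta * x1
                w2 += lr * delta * x2
                bias += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly: stop early
            break
    return w1, w2, bias

def accuracy(params, samples):
    """Fraction of samples the trained unit classifies correctly."""
    w1, w2, bias = params
    hits = sum(step(w1 * x1 + w2 * x2 + bias) == t for (x1, x2), t in samples)
    return hits / len(samples)

INPUTS = list(product([0, 1], repeat=2))
AND = [((a, b), a & b) for a, b in INPUTS]
XOR = [((a, b), a ^ b) for a, b in INPUTS]

def two_layer_xor(x1, x2):
    """Hand-wired two-layer network: XOR = (x1 OR x2) AND NOT (x1 AND x2)."""
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1 fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2 fires only if both inputs are 1
    return step(h_or - h_and - 0.5)

if __name__ == "__main__":
    print("AND accuracy:", accuracy(train_perceptron(AND), AND))
    print("XOR accuracy:", accuracy(train_perceptron(XOR), XOR))
    print("two-layer XOR:", [two_layer_xor(a, b) for a, b in INPUTS])
```

No straight line separates the XOR cases, so the single unit can never get all four right, however long it trains. The two-layer version works only because its weights were chosen by hand. Finding such weights automatically is the problem the next section is about.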

1986 · Backpropagation gets rediscovered

In 1986, Geoffrey Hinton (who would later make the University of Toronto his base), together with David Rumelhart and Ronald Williams, published a paper showing how to train multi-layer neural networks with an algorithm called backpropagation.

The algorithm had been independently invented several times before (going back to the 1960s), but Hinton’s paper got the field’s attention.

If neural nets are a ship, backprop is the engine. Without it, depth doesn’t help.

But compute and data weren’t ready. Neural nets still lagged behind other methods. Hinton later got nicknamed “the man who stayed in the wilderness for 30 years.”
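For readers who want to see what backpropagation actually computes, here is a minimal sketch. It is not the 1986 paper's code: the tiny 2-2-1 network, the squared-error loss, the starting weights in START, and every function name are assumptions chosen for illustration. The idea is that the error at the output is passed backwards through the network with the chain rule, giving the slope of the loss for every weight in one pass. The sketch checks that result against a slow "nudge each weight and re-measure" estimate.

```python
import math

# XOR truth table as (inputs, target) pairs.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Arbitrary starting weights (an assumption for this example).
START = [0.5, -0.4, 0.1, 0.3, 0.8, -0.2, 0.7, -0.6, 0.05]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(p, x):
    """2-2-1 network. p[0:3] and p[3:6] are the hidden units, p[6:9] the output unit."""
    h1 = sigmoid(p[0] * x[0] + p[1] * x[1] + p[2])
    h2 = sigmoid(p[3] * x[0] + p[4] * x[1] + p[5])
    y = sigmoid(p[6] * h1 + p[7] * h2 + p[8])
    return h1, h2, y

def loss(p):
    """Sum of squared errors over the four XOR cases."""
    return sum(0.5 * (forward(p, x)[2] - t) ** 2 for x, t in DATA)

def backprop(p):
    """Gradient of loss(p), computed by passing the output error backwards."""
    grad = [0.0] * 9
    for x, t in DATA:
        h1, h2, y = forward(p, x)
        d_out = (y - t) * y * (1 - y)         # error signal at the output unit
        d_h1 = d_out * p[6] * h1 * (1 - h1)   # share of the error sent back to hidden unit 1
        d_h2 = d_out * p[7] * h2 * (1 - h2)   # share of the error sent back to hidden unit 2
        parts = [d_h1 * x[0], d_h1 * x[1], d_h1,
                 d_h2 * x[0], d_h2 * x[1], d_h2,
                 d_out * h1, d_out * h2, d_out]
        grad = [g + part for g, part in zip(grad, parts)]
    return grad

def numeric_gradient(p, eps=1e-5):
    """Slow reference: nudge each weight up and down and measure the loss change."""
    grad = []
    for i in range(len(p)):
        up = p[:i] + [p[i] + eps] + p[i + 1:]
        down = p[:i] + [p[i] - eps] + p[i + 1:]
        grad.append((loss(up) - loss(down)) / (2 * eps))
    return grad

def train(p, steps=5000, lr=0.5):
    """Plain gradient descent driven by the backprop gradient."""
    for _ in range(steps):
        p = [w - lr * g for w, g in zip(p, backprop(p))]
    return p

if __name__ == "__main__":
    gap = max(abs(a - n) for a, n in zip(backprop(START), numeric_gradient(START)))
    print("largest gap between backprop and numeric gradient:", gap)
    print("loss before:", loss(START), "loss after training:", loss(train(START)))
```

The numeric check needs two full passes over the data per weight, while backprop gets every slope from a single backward pass, which is why it scales to networks with millions of weights. How well train fits XOR depends on the starting weights and learning rate, so treat the training loop as a demonstration rather than a recipe.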

1997 · Deep Blue vs Kasparov

In May 1997, IBM’s chess program Deep Blue beat world champion Garry Kasparov 3.5 to 2.5 in a six-game match.

Important detail: Deep Blue was not a neural network. It used “brute-force search + handcrafted expert rules”, evaluating about 200 million positions per second.

But it gave the world a psychological signal: machines really can beat humans at certain intelligent tasks.

2006 · “Deep Learning” gets its name

Hinton hadn’t given up. In 2006 he published a paper introducing “deep belief networks”, work that helped popularize the term “deep learning.”

The academic response was lukewarm. Mainstream view: neural networks aren’t practical.

But Hinton noticed two things: GPUs and big data. GPUs made training large networks affordable. The internet supplied an abundance of labeled data. The conditions were assembled. Only an ignition point was missing.

2012 · The AlexNet detonation

The ignition came in September 2012.

ImageNet was then the hardest image recognition benchmark: the competition used more than a million labeled images across 1,000 classes. Winners from 2010 and 2011 had error rates of roughly 26-28%.

In 2012, Hinton’s student Alex Krizhevsky built an 8-layer convolutional network (later called AlexNet), dropping the error rate to 16.4%—10 percentage points better than the runner-up.

For academia, this was nuclear-explosion-level. Almost everyone at the conference realized: the rules had changed.

In the following years, deep learning swept everything:

Year	Breakthrough
2013	Deep learning becomes the standard approach in speech recognition
2014	GANs invented, machines start “painting”
2015	Image recognition surpasses humans
2016	AlphaGo defeats Go world champion Lee Sedol

💡 Why AlphaGo mattered so much

Go’s search space is 10^100 times larger than chess. Brute-force search can’t crack it. AlphaGo proved that “neural networks + self-play” can solve problems brute force never could. This was AI’s pivot from “compute machine” to “learning machine.”

2017 · Attention Is All You Need

In June 2017, eight researchers at Google published a short paper with a brash title: Attention Is All You Need.

It introduced a new architecture: Transformer. Originally designed for translation, but researchers quickly found something extraordinary:

As long as you have enough data and parameters, capabilities keep growing.

In 2026, this paper has 120K+ citations. It directly birthed everything that followed.

2020 · GPT-3 — the first jaw-drop moment

OpenAI took the Transformer and pushed scale aggressively.

2018, GPT-1: 117M parameters
2019, GPT-2: 1.5B parameters (OpenAI initially withheld the full model, citing misuse concerns, and released it in stages over the year)
2020, GPT-3: 175B parameters

GPT-3 was the first to make even non-AI people say “holy crap.” Write half a story opener, it continues passably. Give it a coding problem, it writes code. Show it legal jargon, it rewrites in plain English.

But because it was API-only with no friendly interface, the average person never felt it.

2022 · ChatGPT — the consumer moment

On November 30, 2022, OpenAI launched a webpage called ChatGPT.

It did one extremely simple, extremely important thing: wrapped GPT in a chat interface.

Result:

1M users in 5 days (Facebook took 10 months)
100M users in 2 months, at the time the fastest-growing consumer app on record

Before ChatGPT, AI was “lab stuff.” After ChatGPT, your grandma uses AI.

2023-2026 · The Era of Large Models

Then everything accelerates to a blur:

March 2023: GPT-4 released, large capability jump from 3.5
July 2023: Meta open-sources Llama 2, open-source ecosystem rises
Late 2023: Anthropic’s Claude enters mainstream
2024: Google Gemini, Anthropic Claude 3 fully challenge GPT-4
2024: AI Agent concept explodes, models start “doing things on their own”
2025: Multimodal (image/audio/video) fully matures, Sora-class video models become practical
2026: 2M+ token context windows, coding ability surpasses most junior developers

We’re sitting in the middle of this era. Looking back in 10 years, we’ll probably say “those years, one year felt like ten years of the 1990s.”

5 sentences to remember

AI isn’t a technology, it’s a 70-year aspiration. From Turing to ChatGPT, generation after generation.
2012 AlexNet is the watershed. Before it, neural nets were fringe; after, they’re orthodoxy.
2017 Transformer is the infrastructure. All today’s large models build on it.
2022 ChatGPT wasn’t a tech breakthrough, it was a product breakthrough—wrap GPT in a chat box, and the world goes nuts.
We’re at AI’s “printing press moment”. The comparison isn’t hyperbole—the historical scale is plausibly that large.

📝 Want to dig deeper into a person

One documentary: AlphaGo (2017, about the 2016 Go match). For text, we recommend Cade Metz’s Genius Makers.

Next: “What AI Can and Can’t Do — A Truthful Inventory”