AI Learning Curves Across Domains
This is my personal view of how AI technology capability has changed — built by tracking six domains over the past decade and watching the inflection point that ChatGPT created. In Drucker's framework, this is source #7: new knowledge. The technology itself. Understanding how fast it's moving is necessary. But it's not sufficient.
How to Read This
Each domain is scored on a 0–100 Capability Index. This isn't a single benchmark — it's a composite that compresses multiple published benchmarks into one trajectory so you can compare domains side by side. The scores reflect my calibration against the published data; someone else might draw the curves differently.
A concrete example: For Code Generation, an index of 8 in 2018 means GPT-2 could barely produce code snippets. An index of ~90 in 2025 reflects autonomous coding agents completing multi-file tasks — calibrated against HumanEval rising from 0% to 92% and SWE-bench going from <5% to ~49%.
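The compression step described above can be sketched in a few lines. This is a minimal illustration, not the actual scoring code: the normalization bounds and the equal weights are assumptions I'm introducing here, and the SWE-bench ceiling of 60% is a placeholder, not a published figure.

```python
def normalize(score, lo, hi):
    """Map a raw benchmark score onto [0, 1] within an assumed observed range."""
    return max(0.0, min(1.0, (score - lo) / (hi - lo)))

def composite_index(benchmarks, weights):
    """Weighted average of normalized benchmark scores, scaled to 0-100."""
    total = sum(weights[name] * normalize(s, lo, hi)
                for name, (s, lo, hi) in benchmarks.items())
    return 100 * total / sum(weights[name] for name in benchmarks)

# Code Generation circa 2025, using the numbers quoted in the text.
# The (lo, hi) bounds and 50/50 weights are illustrative assumptions.
benchmarks = {
    "HumanEval": (92.0, 0.0, 100.0),  # pass@1 percentage
    "SWE-bench": (49.0, 0.0, 60.0),   # resolved %; the 60% ceiling is assumed
}
weights = {"HumanEval": 0.5, "SWE-bench": 0.5}
index = composite_index(benchmarks, weights)
```

With these assumptions the composite lands in the high 80s, close to the ~90 quoted for Code Generation; a different choice of bounds or weights would shift it, which is exactly why the index is calibration-dependent.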
The model uses two growth phases: steady research progress before ChatGPT (Nov 2022), and rapid acceleration after. The two phases stack — the post-ChatGPT curve adds to the pre-ChatGPT baseline rather than replacing it. Select any domain below to see its specific benchmarks and scoring rationale.
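The two-phase stacking can be written as a sum of two sigmoids: a slow pre-ChatGPT curve plus a faster post-ChatGPT curve that adds on top rather than replacing it. The function shape matches the model described above; all parameter values below are illustrative assumptions, not the hand-calibrated values from the Methodology tab.

```python
import math

def sigmoid(t, midpoint, steepness):
    """Logistic curve in year t, centered at midpoint."""
    return 1.0 / (1.0 + math.exp(-steepness * (t - midpoint)))

def capability(year, base_cap, base_mid, base_k, boost_cap, boost_mid, boost_k):
    """Two-phase capability index: the post-ChatGPT sigmoid stacks on the
    pre-ChatGPT baseline instead of replacing it."""
    pre = base_cap * sigmoid(year, base_mid, base_k)      # steady research progress
    post = boost_cap * sigmoid(year, boost_mid, boost_k)  # post-Nov-2022 acceleration
    return pre + post

# Illustrative parameters: a domain with a modest pre-ChatGPT ceiling (40)
# and a large stacked boost (55) centered shortly after ChatGPT's release.
for year in (2018, 2022, 2024, 2026):
    print(year, round(capability(year, 40, 2019, 0.6, 55, 2023.5, 1.4), 1))
```

Because the boost term is a separate sigmoid with its own midpoint and steepness, "rate acceleration" in this model is literally a second, steeper S-curve switching on; the hand-tuned steepness parameters are what the Limitations section flags as calibration-dependent.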
What the Data Reveals
Nascent domains leap furthest
Agentic Systems went from near-zero to production use in three years — the steepest acceleration of any domain (3.2x steepness). Before 2023, LLM agents were a research idea. Now they ship software, browse the web, and operate computers. Starting from a low base made the growth curve explosive.
Reasoning changed qualitatively
Reasoning & Math didn't just get better — it changed how it works. Pre-2023 models pattern-matched (GSM8K ~58%). Post-ChatGPT, inference-time compute scaling (o1/o3 thinking tokens) pushed GSM8K to ~95%. The acceleration came from a new approach, not just a bigger model.
The floor rose everywhere
Even Text & Writing, which had the highest pre-ChatGPT baseline, saw its learning rate nearly triple. The post-ChatGPT acceleration isn't just about new domains opening up — established capabilities also shifted gears. Use the Rate Acceleration tab to see this pattern across all six domains.
From Capability to Opportunity
The curves above tell you what AI can do. But Drucker's framework asks a harder question: where should you point it? Technology capability (source #7) becomes innovation only when it's aimed at the six sources above it. Here's what these curves suggest:
Agentic Systems → Process Need (#3)
The steepest curve is in agentic AI. That capability maps directly to Drucker's third source — finding the weak link in a process that everyone has been working around. Agents that can navigate multi-step workflows, use tools, and reason about state are uniquely suited to finding and fixing process bottlenecks.
Reasoning & Math → Incongruities (#2)
The qualitative shift in reasoning means AI can now compare what the data says with what the organization assumes. Financial models vs. actual outcomes. Customer behavior vs. the personas we built. This is Drucker's second source — the gap between reality and perception.
Text & Writing → Changes in Perception (#6)
With near-human language understanding, AI can track how people talk about problems — not just what they say, but how the framing shifts over time. When "remote work" becomes "distributed teams," that's a perception change hiding an innovation opportunity.
Multimodal → The Unexpected (#1)
Systems that can see, hear, read, and reason across modalities are powerful anomaly detectors. They can surface the unexpected — the satellite image that contradicts the supply chain report, the customer video that reveals an unintended use case. Drucker's most reliable source of innovation.
The point isn't to automate innovation. It's to use technology capability (#7) as a lens for finding opportunities in sources #1 through #6. The curves tell you the lens is getting sharper fast. The thesis explains why that distinction matters.
Limitations & Perspective
This is a personal model, not an industry standard. The data reflects my reading of published benchmarks and my experience building with these systems. You should bring your own judgment.
- The Capability Index is illustrative, not authoritative — real progress is multi-dimensional and resists single-number summaries.
- Sigmoid parameters are calibrated by hand against published benchmarks, not fit via regression. Different calibrations would shift the curves.
- Milestone values are approximate placements on the composite index, not direct benchmark scores.
- The 0–100 scale is relative to the modeled window (2015–2026). 100 does not mean “AGI” or “perfect”.
For the full model specification, parameter tables, and benchmark sources, see the Methodology tab in the visualization above.