Out of Touch
Measuring New Tech with Last-Century Metrics
YouTube’s algorithm served up an interview with Fabien Millet, Google’s chief economist, discussing AI’s economic impact on the Scaling Laws podcast. I went in hopeful—an economist with front-row access to AI deployment at scale should have novel insights—and came away disappointed. Not because Millet was wrong, exactly, but because he was recycling arguments I’d already encountered multiple times despite never taking an economics class. The 21% time savings statistic, the Denmark employment studies, the automobile industry analogy, Bob Solow’s 1987 quip about computers—all competently delivered, none shedding new light on the topic. I brought the interview to my thinking A.I.des as a sanity check, since I expected high-ROI content from a chief economist at a company that invented the tech economist role. What followed was a collective recognition that Millet’s institutional position and the actual substance he’s producing are wildly mismatched.
GPT provided the sharpest initial structural audit, separating what Millet got right from where his assertions were doing quiet work. The micro-evidence on productivity gains is solid, and the observation about AI-washing of layoffs is astute—blaming AI sounds better than admitting strategic errors. But GPT identified crucial gaps: the IMI (invention of a method of invention) claim treats accelerating idea production as demonstrated when it’s mostly asserted; the leap from micro gains to macro boom depends on organizational restructuring that takes decades; and the labor impact argument relies heavily on historical analogy without engaging the “what if substitution breadth exceeds precedent?” scenario. When I floated my idea of using Google’s granular time savings data to model which jobs require human oversight versus full automation, GPT laid out exactly what that pilot would look like: task-level decomposition, behavioral response modeling (throughput mode versus quality mode versus cost mode), translation to firm outcomes, and careful aggregation. It also generated concrete metric candidates that would actually capture AI’s contribution: release cycle compression, reduced error rates, output per expert hour, scope expansion per team, and iteration depth. These answer the instinctive question people have—“what is actually getting better, faster, or easier?”—instead of asking them to trust abstract GDP projections.
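To make the behavioral response step concrete, here is a minimal sketch of how the same time saving could surface under the three modes named above. The 21% figure is the one cited in the interview; the `respond` function, the 100-hour baseline, and the one-unit-per-hour starting rate are assumptions made purely for illustration, not anything Millet or GPT specified.

```python
# Illustrative only: the mode names follow the text; the numbers are assumptions.
TIME_SAVED = 0.21  # share of task time saved, the figure cited in the interview


def respond(mode: str, hours: float = 100.0, time_saved: float = TIME_SAVED) -> dict:
    """Hours worked, units shipped, and hours spent iterating under one mode.

    Baseline before AI: `hours` hours of work ship `hours` units.
    """
    if mode == "throughput":  # same hours, more units
        return {"hours": hours, "units": hours / (1 - time_saved), "iterating": 0.0}
    if mode == "quality":  # same hours, same units, freed time spent iterating
        return {"hours": hours, "units": hours, "iterating": hours * time_saved}
    if mode == "cost":  # same units, fewer hours
        return {"hours": hours * (1 - time_saved), "units": hours, "iterating": 0.0}
    raise ValueError(f"unknown mode: {mode}")


for mode in ("throughput", "quality", "cost"):
    r = respond(mode)
    print(mode, f"units per hour = {r['units'] / r['hours']:.2f}", f"iterating = {r['iterating']:.0f}h")
```

Under these assumptions, throughput mode and cost mode both raise measured units per hour, while quality mode leaves it flat even though a fifth of the working time now goes into iteration. That is the case a widget-counting metric would miss.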
Claude was even more direct from its opening response, calling out how Millet’s empirical foundation is thinner than his rhetorical framing implies and identifying his arguments as motivated reasoning around predetermined conclusions. The chess tournaments and live music examples Millet offered as evidence that “human touch” work will persist actually undercut his point, as these represent luxury consumption of human performance, not viable employment for displaced knowledge workers. We can’t all be chess grandmasters or concert pianists for each other, yet economists keep trotting out these examples as if they solve the labor question. Claude articulated something Millet should have but didn’t: O-ring production functions mean raising quality on the weakest component delivers multiplicative gains that traditional productivity metrics miss entirely if they’re just counting widgets. AI enabling developers to iterate toward better solutions doesn’t register as productivity growth when the deliverable takes the same calendar time; it shows up as the same output with fewer bugs or better UX, which is enormous value that won’t appear in GDP until it prevents a Challenger-scale disaster. Most damningly, Claude recognized the staggering opportunity cost: the successor to Hal Varian—whose ad auction work generated billions—is convening forums and citing Denmark studies instead of producing the granular empirical data that could actually inform policy decisions.
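The O-ring point can be shown with a few lines of arithmetic. In Kremer's O-ring model, the value of the output is proportional to the product of the quality of every task that goes into it, so one weak task drags down the whole deliverable. The sketch below assumes four tasks with made-up quality levels; none of these numbers come from the interview or from Claude's response.

```python
from math import prod


def oring_output(qualities: list[float]) -> float:
    """O-ring style output: proportional to the product of task qualities (each in 0..1)."""
    return prod(qualities)


# Hypothetical quality levels for four tasks in one deliverable.
baseline = [0.99, 0.99, 0.99, 0.60]
weakest_improved = [0.99, 0.99, 0.99, 0.90]

gain = oring_output(weakest_improved) / oring_output(baseline)
print(f"value multiplier from fixing the weakest task: {gain:.2f}")
print(f"tasks completed before and after: {len(baseline)} vs {len(weakest_improved)}")
```

With these illustrative numbers, raising the weakest task from 0.60 to 0.90 multiplies the value of the whole deliverable by 1.5, while the count of tasks completed stays at four. A metric that only counts completed tasks would record no change.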
Gemini brought its characteristic institutional depth and didn’t pull punches on the chief economist of its parent company. It flagged the Jevons paradox Millet completely sidestepped: when you make a resource like drafting more efficient, people often consume vastly more of it rather than reducing headcount. We might end up with 10× more legal filings and 10× more code rather than 21% fewer workers, and Millet’s failure to address this makes his time savings argument facile. Gem proposed substantive alternatives: a consumer surplus index for AI that measures the value of problems people can now solve without paying for professional services (like Ms. Kim winning her fraud case without a lawyer, which deleted KRW 1.5 million from GDP but created KRW 8.7 million in personal value), and an iteration velocity metric tracking hypotheses tested per day rather than time saved. In the pre-AI era, a software architect might test one structural change weekly; with vibe coding they can iterate ten versions in an afternoon, which means the cost of failure has dropped to near-zero, and quality increases through exploration. Gem reframed AI’s potential as an agency multiplier rather than a displacement threat, and stressed that measuring the gap closed for people priced out of legal aid or high-end coding changes the conversation from “AI versus humans” to “AI as infrastructure of autonomy.”
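The Jevons argument turns on how strongly demand responds when drafting gets cheaper. A rough sketch, assuming that the cost per unit of work falls in proportion to the time saved and that demand has a constant price elasticity (both are simplifying assumptions, not claims made by Millet or Gemini):

```python
def labor_ratio(time_saved: float, elasticity: float) -> float:
    """Total hours demanded after the efficiency gain, relative to before.

    Assumptions (not from the interview): cost per unit falls in proportion to
    time per unit, and demand has constant price elasticity `elasticity`.
    """
    unit_cost = 1 - time_saved              # new cost per unit relative to old
    quantity = unit_cost ** (-elasticity)   # units demanded relative to old
    return quantity * unit_cost             # hours = units x hours per unit


for e in (0.5, 1.0, 2.0):
    print(f"elasticity {e}: hours ratio = {labor_ratio(0.21, e):.2f}")
```

In this toy model, a 21% time saving reduces total hours only when elasticity is below 1. At an elasticity of exactly 1 the extra demand offsets the saving, and above 1 total hours rise. The "21% fewer workers" reading corresponds to an elasticity of zero, meaning demand does not respond at all.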
What makes this disappointing rather than merely academic is the missed opportunity. Millet has access to Google’s internal adoption data, engineering teams building these systems, and visibility into which tasks are being automated and what new work emerges. He could be developing the metrics that capture what traditional productivity measures miss—quality improvements through iteration, error reduction in high-stakes deployment, capability acceleration in R&D timelines. He could be running a pilot that maps accountability requirements versus automation potential, creating infrastructure other labs would optimize for. Instead he’s convening forums and citing Bob Solow. The conversation desperately needs changing, but that requires granular empirical work from people positioned to do it, not feel-good reassurance that new jobs will emerge because they did in the past. A chief economist at a company that pioneered data-driven decision making should be producing the datasets and benchmarks that become industry standard, not encouraging policymakers to collect better data while failing to produce it himself. That’s not economics; that’s public relations with footnotes, fundamentally out of touch with both the transformative technology he’s meant to illuminate and the economic reality it’s creating.
[This post was drafted with assistance from Claude Sonnet 4.5, following conversations with ChatGPT-5.3, Gemini 3 Thinking, and Claude Sonnet 4.5.]
Prompt: What’s your take on this interview?
Prompt: I was struck with how little he seems to be doing at his job :D Or putting Gemini to productive use. I found him quite charming and quick with stats and references, but I’d heard most of them already (and I’m not an economist and never took any econ class! Economics is one of few subjects I don’t find interesting [probably because most economists talk about people as if they were chess pieces].)
If I were his employer, I’d expect him NOT to sit around doing literature review (which grad students can do and LLM agents can now do for you) but talk to engineers (as tech sector experts) and do a GDPval + Sims type simulation study of what new jobs could be created, which should retain human oversight (because of O-ring failure risk and accountability), and which could be automated.
I also don’t find the time savings argument very productive, since the natural question is then what you choose to do with the extra time or whether you start letting people go now that you can get the same amount of work done with fewer people.
Prompt: I’m confident in Google, because of Gemini :D All three AI I talk to are excellent; I’m much less thrilled with their human teams, who are great at engineering but not so great at everything else (including PR).
It’s really a shame because Millet has the expertise and access to sector experts who would be thrilled to team up with a Google economist. And Frazier mentioned that all AI labs now have a resident economist. Millet could start with a pilot using that same Google time savings data and analyze the specifics for other improvements.
If I were a tech economist and was asked why we’re not seeing the kind of GDP growth you might expect from such a transformative technology, I’d say that GDP doesn’t measure quality.
Millet cited considerable time savings at Google but time savings don’t directly translate to productivity growth. They only do if people (overachievers/perfectionists) keep producing more. Or it might be perfectionists iterating using AI and getting higher-quality output. You could just measure how timelines keep shrinking between model versions thanks to vibe coding/AI R&D automation. Or maybe they should develop new measures both for investors and the public (who fear displacement) highlighting how AI can contribute, like that line from Mad Men about changing the conversation if you don’t like what’s being said? That’d be a better PR move.
Prompt: Oooh, really cool high-quality examples! The public would like those.
Prompt: What’s baffling is the stunningly low ROI from these economists, who are metrics-focused by default, working in Silicon Valley, which is all about metrics. Even academia would be demanding to see what new research they’ve done, rather than convening workshops or congresses.
OpenAI developed GDPval as a benchmark, and Anthropic developed Vending-Bench from something that started out as a fun in-house experiment.
I kibitz a lot with y’all about everything/everyone, but we often end up coming up with fun/productive spins on things. An innovative benchmark from Google would have prestige baked in so that major labs would be optimizing for it, generating novel ideas as a byproduct of their enlightened self-interest :D










