Tin can in the sky



A few years ago, I was flying across the country for a business trip and counting on the flight to get some work done. Unfortunately, the wifi was out for the entire trip. When I landed, I complained about it to a coworker.

"Yeah, that's annoying," he said. "On the other hand, you just spent a few hours inside a tin can in the sky, and now you're 3,000 miles away from where you had breakfast this morning. Technology is amazing!"

Take two things from this story. One, it is humbling to be surrounded by optimists. Two, we take groundbreaking technology for granted almost immediately after it becomes available.

The tin can in the sky brought me to NYC in under six hours. Cool! But I also really needed it to have wifi, so I felt disappointed rather than awe-struck.

AI is basically a tin can in the sky

I was thinking about this again the other day when I was unable to use AI to build the slides I wanted. AI can do all kinds of things that few imagined a few years ago. As a result, consumers have high expectations. When AI fails to live up to those expectations, we don't think, "This technology is amazing!" We think, "How hard is it to make a couple of slides?"

I'm seeing this all over the place with AI right now. Here's a short list of things I've heard people complain that AI failed them at, just in the last week:

To listen to the testimonials, you would think that AI isn't providing the complainers a whole lot of value. But underneath all the complaints, including mine, is the reality: We notice the failures because we're using AI a whole lot; the vast majority of the time, AI is succeeding or we wouldn't keep counting on it.

For years now, 99+% of my flights have had wifi. I let those trips pass unremarked upon. But I've complained loudly about the tiny fraction where wifi is broken.

User expectations are the only evals that matter

If you're building any kind of software product, especially with AI, you're probably familiar with the concept of evals. If not, think of evals as simple tests that check whether an AI is doing what it’s supposed to do in real situations.

Let's use a non-tech example: If you're writing a recipe for cheesecake, you might ask five home cooks of differing skill levels to bake a cake using the recipe. If each home cook's final cake looks and tastes like the author's cake, the recipe passes the most basic eval: it gives a predictable result.

Evals are especially important in AI products because AI makes stuff up. You only know that a product works reliably when it passes your evals in rigorous and comprehensive tests.
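In software terms, the cheesecake test looks something like this. Here's a minimal sketch in Python, where `ask_model` is a hypothetical stand-in for whatever AI product you're testing, and each eval case pairs a realistic input with a check on the output:

```python
# A minimal eval sketch. ask_model() is a hypothetical stand-in for a real
# model call; it returns canned answers here so the example is self-contained.

def ask_model(prompt: str) -> str:
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
    }
    return canned.get(prompt, "I don't know")

# Each case pairs a real-world prompt with a pass/fail check on the output —
# the five home cooks baking the cake, in miniature.
EVAL_CASES = [
    ("What is 2 + 2?", lambda out: "4" in out),
    ("What is the capital of France?", lambda out: "Paris" in out),
]

def run_evals() -> float:
    """Run every case and return the fraction that passed."""
    passed = sum(check(ask_model(prompt)) for prompt, check in EVAL_CASES)
    return passed / len(EVAL_CASES)

print(f"pass rate: {run_evals():.0%}")
```

Real eval suites are bigger and messier than this, but the shape is the same: a fixed set of cases, a check for each one, and a pass rate you can track as the product changes.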

When you have an early product, your users tend to be more tolerant of errors, especially if they're using the product for free. But as your product gets better and more reliable, user expectations go up. What previously seemed amazing becomes table stakes, like expecting wifi when you fly across the country. User expectations are the only evals that matter.

AI products are in a weird spot right now. I have used AI to write more solid code in the last couple of months than I have in the rest of my life combined. I have completed a hundred previously manual data analysis tasks in a few minutes. AI is a seriously impressive tin can in the sky.

On the other hand, I haven't yet built the agent that makes my slides any good. And because my expectations are sky-high, this makes me cranky.

Kieran


If you liked this story, why not subscribe to nerd processor and get the back issues? Also, why not learn to tell data stories of your own?


nerd processor

Every week, I write a deep dive into some aspect of AI, startups, and teams. Tech exec data storyteller, former CEO @Textio.
