Tin can in the sky


A few years ago, I was flying across the country for a business trip and counting on the flight to get some work done. Unfortunately, the wifi was out for the entire trip. When I landed, I complained about it to a coworker.

"Yeah, that's annoying," he said. "On the other hand, you just spent a few hours inside a tin can in the sky, and now you're 3,000 miles away from where you had breakfast this morning. Technology is amazing!"

Take two things from this story. One, it is humbling to be surrounded by optimists. Two, we take groundbreaking technology for granted almost immediately after it becomes available.

The tin can in the sky brought me to NYC in under six hours. Cool! But I also really needed it to have wifi, so I felt disappointed rather than awe-struck.

AI is basically a tin can in the sky

I was thinking about this again the other day when I was unable to use AI to build the slides I wanted. AI can do all kinds of things that few imagined a few years ago. As a result, consumers have high expectations. When AI fails to live up to those expectations, we don't think, "This technology is amazing!" We think, "How hard is it to make a couple of slides?"

I'm seeing this all over the place with AI right now. Here is a short list of things I've heard people complain that AI has failed them at in the last week alone:

To listen to the testimonials, you would think that AI isn't providing the complainers a whole lot of value. But underneath all the complaints, including mine, is the reality: We notice the failures because we're using AI a whole lot; the vast majority of the time, AI is succeeding or we wouldn't keep counting on it.

For years now, 99+% of my flights have had wifi. I let those trips pass unremarked upon. But I've complained loudly about the tiny fraction where wifi is broken.

User expectations are the only evals that matter

If you're building any kind of software product, especially with AI, you're probably familiar with the concept of evals. If not, think of evals as simple tests that check whether an AI is doing what it’s supposed to do in real situations.

Let's use a non-tech example: If you're writing a recipe for cheesecake, you might ask five home cooks of differing skill levels to bake a cake using the recipe. If each home cook's final cake looks and tastes like the author's cake, the recipe passes the most basic eval: it gives a predictable result.

Evals are especially important in AI products because AI makes stuff up. You only know that a product works reliably when it passes your evals in rigorous and comprehensive tests.
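If it helps to see the idea in code, here is a minimal sketch of an eval harness in Python. Everything in it is made up for illustration: `toy_model` is a hypothetical stand-in for a real AI call, and the prompts and checks are invented examples, not anyone's real test suite. The point is only the shape: a list of cases, a pass/fail check for each, and a pass rate at the end.

```python
from typing import Callable

# Hypothetical stand-in for a real AI call (an assumption for this sketch).
# It returns canned answers and gives up on anything it has not seen.
def toy_model(prompt: str) -> str:
    canned = {
        "What is 2 + 2?": "4",
        "Capital of France?": "Paris",
        "Spell 'wifi' backwards.": "ifiw",
    }
    return canned.get(prompt, "I don't know")

# Each eval case pairs a prompt with a check that returns True on a pass.
# These cases are invented examples.
EVAL_CASES: list[tuple[str, Callable[[str], bool]]] = [
    ("What is 2 + 2?", lambda out: out.strip() == "4"),
    ("Capital of France?", lambda out: "paris" in out.lower()),
    ("Spell 'wifi' backwards.", lambda out: out.strip() == "ifiw"),
    ("Write a slide title about Q3 revenue.", lambda out: "Q3" in out),
]

def run_evals(model: Callable[[str], str],
              cases: list[tuple[str, Callable[[str], bool]]]) -> float:
    """Run every case through the model and return the fraction that pass."""
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)

pass_rate = run_evals(toy_model, EVAL_CASES)
print(f"pass rate: {pass_rate:.0%}")
```

In this toy setup, three of the four cases pass and the slide case fails. A real eval suite would have far more cases and fuzzier checks, but the loop is the same.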

When you have an early product, your users tend to be more tolerant of errors, especially if they're using the product for free. But as your product gets better and more reliable, user expectations go up. What previously seemed amazing becomes table stakes. Like when you fly across the country, you expect the plane will have wifi. User expectations are the only evals that matter.

AI products are in a weird spot right now. I have used AI to write more solid code in the last couple of months than I have in the rest of my life combined. I have completed a hundred previously manual data analysis tasks in a few minutes. AI is a seriously impressive tin can in the sky.

On the other hand, I haven't yet built the agent that makes my slides any good. And because my expectations are sky-high, this makes me cranky.

Kieran


nerd processor

Every week, I write a deep dive into some aspect of AI, startups, and teams. Tech exec data storyteller, former CEO @Textio.
