Why does AI fail at simple tasks?


Easy as A-B-C

In my first job at Microsoft, I made it possible for Windows, Office, and other applications to put words from any language in alphabetical order. I know, you're thinking: That was a whole job? Any eight-year-old can put words in order!

But at the time, Microsoft was expanding into languages that had never been encoded on computers before. Many of them didn't have traditional dictionaries and had no unified concept of alphabetical order.

I'll never forget the weekend I spent in my office designing the system for Sinhala, a language spoken in Sri Lanka. It was just after the 2004 tsunami, and the Sri Lankan government had to locate thousands of missing people. It was impossible to organize the search effort when computers couldn't sort lists of names into a predictable order.

Steve Ballmer, Microsoft's CEO at the time, committed that we would accelerate our timeline for providing language support to help with the relief effort. My boss called me. I had 48 hours to figure it out.

So there I was in my office on a Sunday with three Sinhala textbooks that disagreed with each other about what counted as alphabetical. I had three officials in the Sri Lankan government on speed dial, trying to piece it together.

Speed mattered, so I did my best and we rushed to ship something. A few years later, the Sri Lankan government turned the system I'd designed that weekend into a national standard.

We've come a long way in 20 years. These days, it's easy for computers to put words from any language in alphabetical order.

Unless you use AI.

Easy as A-C-B

We don't need to look at languages with complex writing systems to find problems with how computers sort words in 2025. We just need to use AI.

For this experiment, I asked ChatGPT to put lists of English words in alphabetical order. Each ten-word list is tricky, but not that tricky; a few words on each list start with the same characters. You can plop any of the lists into Excel, click the A-Z sort button, and get the right result. You can also ask any third grader to do it.

AI, on the other hand, is comically bad at this task. Don't believe me? Let's take a look.

It's not just a fluke. With one list after another, ChatGPT fails to put the words in the correct order.

You've heard of hallucinations, where the accuracy of AI content is not predictably reliable. This is a step further. In this scenario, the accuracy of AI is predictably unreliable.

In fact, even when asked to take another look, AI fails to spot its own mistake.

Across the 100 examples I looked at, AI failed to spot its own mistakes in 73.
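For contrast, checking whether a list is in order is a mechanical job for ordinary code. Here is a minimal Python sketch. The function name and the sample words are my own illustration, not lists from the experiment, and it assumes plain English words compared without regard to case.

```python
def out_of_order_pairs(words):
    """Return each adjacent pair where the first word should come after the second."""
    lowered = [w.lower() for w in words]
    return [
        (words[i], words[i + 1])
        for i in range(len(words) - 1)
        if lowered[i] > lowered[i + 1]
    ]


# Hypothetical "AI answer" with one slip: flake is placed before flailing.
attempt = ["falling", "flake", "flailing", "flakiness"]
print(out_of_order_pairs(attempt))  # [('flake', 'flailing')]
```

An empty result means every neighbor is in the right order, so the whole list is sorted.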

AI is trained, not programmed

Computers are traditionally good at ordering lists of words because the task can be described with a set of clear rules. Sometimes the rules are complex, like with the work I had to do in the aftermath of the tsunami. But once you know a language's sorting rules and can describe them clearly, it's easy to program a computer to follow them. Computers are great at following instructions.
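To make "describe the rules, then program them" concrete, here is a small Python sketch. The alphabet is a toy I made up for illustration: it treats "ch" as a single letter that sorts after "c". It is not the Sinhala system from the story, and it assumes words contain only the lowercase letters a to z.

```python
# Toy alphabet (an assumption for illustration): "ch" counts as one letter
# that sorts after "c". This is not the Sinhala rule set from the story.
LETTERS = list("abc") + ["ch"] + list("defghijklmnopqrstuvwxyz")
RANK = {letter: position for position, letter in enumerate(LETTERS)}


def tokenize(word):
    """Split a word into toy-alphabet letters, preferring "ch" over "c" + "h"."""
    letters = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in RANK:
            letters.append(word[i:i + 2])
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters


def sort_key(word):
    """Translate a word into a list of letter ranks that Python can compare."""
    return [RANK[letter] for letter in tokenize(word.lower())]


def alphabetize(words):
    return sorted(words, key=sort_key)


print(alphabetize(["chico", "cuna", "dama", "casa"]))
# ['casa', 'cuna', 'chico', 'dama']
```

A plain sort would put chico before cuna. Once the rule is written down as a ranking, the computer follows it the same way every time.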

AI, on the other hand, isn't much of a rule follower. In fact, when I ask AI to put words in alphabetical order, it doesn't even attempt to be accurate. Rather, AI's goal is to produce something that looks like an ordered list of words, regardless of the details. It's not going for accuracy; it's going for plausibility. Forest, not trees.

In this case, AI isn't going character by character to put things in precise order. It is confused by characters that look alike in a quick visual scan, like the lli in falling and the ili in flailing. When words have the same root, like flake and flakiness, AI wants to group them together even if it's weird alphabetically.
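Here is what going character by character looks like for those exact words, as a minimal Python sketch. The helper names are mine, and it assumes lowercase English words.

```python
def first_difference(a, b):
    """Return the index of the first position where the two words differ."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None  # one word is a prefix of the other, or they are identical


def comes_first(a, b):
    """Return whichever word sorts first alphabetically."""
    i = first_difference(a, b)
    if i is None:
        return a if len(a) <= len(b) else b  # the shorter word wins a tie
    return a if a[i] < b[i] else b


print(first_difference("falling", "flailing"))  # 1: "a" vs "l"
print(first_difference("flake", "flakiness"))   # 4: "e" vs "i"
print(sorted(["flakiness", "flailing", "flake", "falling"]))
# ['falling', 'flailing', 'flake', 'flakiness']
```

The lookalike middles of falling and flailing never come into play, because the words already differ at the second letter. Likewise, flake and flakiness share a root, but the fifth letter settles their order.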

Alphabetizing is like running a super-strict recipe. You, as a human, know how to follow it exactly. But AI typically short-cuts by guessing what “looks right,” which results in mistakes.

Don't get me wrong, this forest-not-trees approach is great for many tasks. When you want to get a list of possible themes within a piece of literature, it works great. When you're looking for ideas for your next team bonding event, or an analysis of your public speaking skills, or a high-level description of how a photoelectric smoke detector works, it is excellent.

But if you want to make sure the legal cases you cite are legitimate, or you want a cake recipe you can reliably follow, or you want to put word lists in alphabetical order, not so much.

Kieran



kieran@nerdprocessor.com

nerd processor

Every week, I write a deep dive into some aspect of AI, startups, and teams. Tech exec data storyteller, former CEO @Textio.
