Why does AI fail at simple tasks?

Easy as A-B-C

In my first job at Microsoft, I made it possible for Windows, Office, and other applications to put words from any language in alphabetical order. I know, you're thinking: That was a whole job? Any eight-year-old can put words in order!

But at the time, Microsoft was expanding into languages that had never been encoded on computers before. Many of them didn't have traditional dictionaries and had no unified concept of alphabetical order.

I'll never forget the weekend I spent in my office designing the system for Sinhala, a language spoken in Sri Lanka. It was just after the 2004 tsunami, and the Sri Lankan government had to locate thousands of missing people. It was impossible to organize the search effort when computers couldn't sort lists of names into a predictable order.

Steve Ballmer, Microsoft's CEO at the time, committed that we would accelerate our timeline for providing language support to help with the relief effort. My boss called me. I had 48 hours to figure it out.

So there I was in my office on a Sunday with three Sinhala textbooks that disagreed with each other about what counted as alphabetical. I was on speed dial with three officials in the Sri Lankan government trying to piece it together.

Speed mattered, so I did my best and we rushed to ship something. A few years later, the Sri Lankan government turned the system I'd designed that weekend into a national standard.

We've come a long way in 20 years. These days, it's easy for computers to put words from any language in alphabetical order.

Unless you use AI.

Easy as A-C-B

We don't need to look at languages with complex writing systems to find problems with how computers sort words in 2025. We just need to use AI.

For this experiment, I asked ChatGPT to put lists of English words in alphabetical order. Each ten-word list is tricky, but not that tricky; a few words on each list start with the same characters. You can plop any of the lists into Excel, click the A-Z sort button, and get the right result. You can also ask any third grader to do it.

AI, on the other hand, is comically bad at this task. Don't believe me? Let's take a look.

It's not just a fluke. With one list after another, it fails to alphabetize the lists correctly.

You've heard of hallucinations, where the accuracy of AI content is not predictably reliable. This is a step further. In this scenario, the accuracy of AI is predictably unreliable.

Even when asked to take another look, AI fails to spot its own mistake.

In fact, AI failed to spot its own mistakes in 73 of the 100 examples I looked at.
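For contrast, checking whether a list is in order is mechanical for a conventional program. The sketch below is only an illustration: the function name `find_order_errors` and the four-word list are made up for this example and are not the lists or the tooling from the experiment.

```python
def find_order_errors(words):
    """Return each adjacent pair that is out of alphabetical order."""
    return [
        (first, second)
        for first, second in zip(words, words[1:])
        if first.lower() > second.lower()
    ]


# Hypothetical AI-style answer with one pair swapped.
proposed = ["falling", "flake", "flailing", "flakiness"]
errors = find_order_errors(proposed)
print(errors)  # [('flake', 'flailing')]
```

An empty result means every neighboring pair is in order, which is enough to confirm that the whole list is sorted.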

AI is trained, not programmed

Computers are traditionally good at ordering lists of words because the task can be described with a set of clear rules. Sometimes the rules are complex, like with the work I had to do in the aftermath of the tsunami. But once you know a language's sorting rules and can describe them clearly, it's easy to program a computer to follow them. Computers are great at following instructions.
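Here is a minimal sketch of what "describe the rules, then let the computer follow them" can look like. It assumes the rules boil down to a single ordered alphabet, which is a big simplification. The alphabet below is a toy invented for illustration (English with the vowels moved to the front), not the Sinhala rules or anything Microsoft shipped, and `make_sort_key` is a hypothetical helper name.

```python
def make_sort_key(alphabet):
    """Build a sort key from an explicit alphabet order (a toy rule set)."""
    rank = {letter: position for position, letter in enumerate(alphabet)}

    def key(word):
        # Map each character to its rank; unknown characters sort last.
        return [rank.get(ch, len(alphabet)) for ch in word.lower()]

    return key


# Illustrative alphabet only: English letters with the vowels moved first.
TOY_ALPHABET = "aeioubcdfghjklmnpqrstvwxyz"

words = ["banana", "apple", "echo", "cherry", "igloo"]
sorted_words = sorted(words, key=make_sort_key(TOY_ALPHABET))
print(sorted_words)  # ['apple', 'echo', 'igloo', 'banana', 'cherry']
```

Real languages need more than one flat list of letters, but the principle is the same: once the rules are written down, the program applies them the same way every time.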

AI, on the other hand, isn't much of a rule follower. In fact, when I ask AI to put words in alphabetical order, it doesn't even attempt to be accurate. Rather, AI's goal is to produce something that looks like an ordered list of words, regardless of the details. It's not going for accuracy; it's going for plausibility. Forest, not trees.

In this case, AI isn't going character by character to put things in precise order. It is confused by characters that look alike in a quick visual scan, like the “lli” in “falling” and the “ili” in “flailing.” When words have the same root, like “flake” and “flakiness,” AI wants to group them together even if it's weird alphabetically.
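Going character by character looks roughly like this. It is a sketch of plain letter-by-letter comparison, using the example words from the paragraph above; `first_difference` is a hypothetical helper written for this illustration.

```python
def first_difference(a, b):
    """Return the index where two words first differ, or None if they are equal."""
    for i, (char_a, char_b) in enumerate(zip(a, b)):
        if char_a != char_b:
            return i
    # No mismatch in the shared part: the words are equal or one is a prefix.
    return None if len(a) == len(b) else min(len(a), len(b))


# "falling" and "flailing" already differ at the second letter: "a" vs "l".
print(first_difference("falling", "flailing"))  # 1

# "flake" and "flakiness" share "flak", then differ: "e" vs "i".
print(first_difference("flake", "flakiness"))  # 4

ordered = sorted(["flakiness", "flailing", "flake", "falling"])
print(ordered)  # ['falling', 'flailing', 'flake', 'flakiness']
```

The comparison never looks at the overall shape of a word. It stops at the first letter that differs, and that letter alone decides the order.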

Alphabetizing is like following a super-strict recipe. You, as a human, know how to follow it exactly. But AI typically takes a shortcut by guessing what “looks right,” which results in mistakes.

Don't get me wrong, this forest-not-trees approach is great for many tasks. When you want to get a list of possible themes within a piece of literature, it works great. When you're looking for ideas for your next team bonding event, or an analysis of your public speaking skills, or a high-level description of how a photoelectric smoke detector works, it is excellent.

But if you want to make sure the legal cases you cite are legitimate, or you want a cake recipe you can reliably follow, or you want to put word lists in alphabetical order, not so much.

Kieran

