Why is it so hard to get AI right?


The internet in 2024, a three-act play

Act 1: Google strikes a deal to use Reddit posts as training data
Act 2: Google swaps in AI for its usual search results
Act 3: Google tells you to eat rocks and put glue on your pizza

Why is it so hard to get AI right?

Hey Siri, what's a HIPAA violation?

I had this conversation with the HR leader of a major hospital system last year, and I haven't stopped thinking about it since.

It was a few months after ChatGPT was broadly released, and she shared how enthusiastically everyone in her organization had embraced the tool. The hospital had convened a task force to look at AI, and they were busy thinking up powerful use cases to accelerate patient care.

Except that the doctors had already put an unsanctioned use case into motion, and it was enough to give the CISO a coronary: the doctors were dropping confidential patient notes into ChatGPT to write case summaries faster. Their AI firewall went up faster than you can say HIPAA violation.

Why is it so hard to get AI right?

Sparkle questions

You know those little AI sparkle questions that LinkedIn adds below most posts now? Sometimes they're relevant, mostly they're kind of weird, and sometimes they're real clunkers.

I saw a post someone made about surviving cancer, and the sparkle question underneath it was, "How can cancer add value to our lives?" This is all kinds of yikes, so I did the obvious thing: I wrote a post about it. Unfortunately, LinkedIn appended another dicey sparkle question to my post about the original post.

Uh oh. Why is it so hard to get AI right?

Finding terrible AI examples is easy!

There are so many transformative, wonderful AI use cases and apps. But it's just not that hard to find these terrible examples, and so many of them feel like unforced errors. Popular and trusted apps silently updating their privacy policies so they can use customer input as training data. People dumping sensitive information into free public websites. Developers racing to ship AI features without safeguards for accuracy, privacy, or bias.

It's fair to say that I'm not a typical user; I spent a decade building proprietary technology to handle AI sensitivities at Textio, and it is very, very hard. After I saw the problematic LinkedIn questions, I started searching different topics to see if I could figure out what would trigger those little questions.

I discovered quickly that if I searched for terms like guns or Israel, there were no sparkle questions. In other words, I could clearly see that LinkedIn has implemented some "catch" topics where the system intercepts the query and doesn't take the risk of showing inappropriate questions. But, at least until my post, cancer wasn't one of them.
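To make the approach concrete, here's a minimal sketch of what this kind of keyword interception might look like. The topic list, function names, and model stub are my own illustration of the general technique, not LinkedIn's actual implementation:

    # A minimal sketch of the "catch topic" approach: intercept posts that
    # mention blocked keywords before they ever reach the generative model.
    # The blocklist, names, and model stub below are illustrative only.

    CATCH_TOPICS = {"guns", "israel"}  # hypothetical blocklist

    def call_generative_model(post_text: str) -> str:
        # Stand-in for the real LLM call that drafts a follow-up question.
        return "How has this experience shaped your perspective?"

    def generate_sparkle_question(post_text: str) -> str | None:
        # Return an AI follow-up question, or None for intercepted topics.
        words = set(post_text.lower().split())
        if words & CATCH_TOPICS:
            return None  # intercepted: safer to show no question at all
        return call_generative_model(post_text)

    # The filter catches the exact keywords someone thought to list...
    print(generate_sparkle_question("My thoughts on guns in America"))  # None
    # ...but a post about surviving cancer sails right through.
    print(generate_sparkle_question("Five years cancer-free today"))

The demo at the bottom is the whole story in miniature: the filter only knows the topics someone listed in advance, and everything else goes straight to the model.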

In just 30 minutes of rooting around, I found dozens of other topics that occasionally triggered highly insensitive sparkle questions: anorexia, abuse, adoption, layoffs, and immigration, to name just a few. Finding terrible examples was easy.

So why is it so hard to get AI right?

LinkedIn is not alone in its approach to trying to solve this problem; this kind of query filtering is what everyone does. Remember in 2023 when conservatives blasted OpenAI for the way ChatGPT talked differently about Joe Biden and Donald Trump? OpenAI "fixed" that the fastest way they could think of: by intercepting the political queries before processing them and tossing up the equivalent of "I can't do that, Dave."

Unfortunately, this never quite works, because you just can't write rules to catch all the ways that future queries might be problematic. I've written extensively about the racist ways in which ChatGPT describes alumni of Harvard University and Howard University differently, or about the stereotypical "roses are red" poems that the system writes for people of different backgrounds. It's easy to trip over the problematic biases in most AI just by wording queries a little differently than what app developers have planned for.

The problem can't be solved by manually intercepting people's queries. The issue is with the underlying generative engine and data set, so app developers who try to intercept queries end up playing whack-a-mole with mortifying examples forever. After all, nearly any topic can be sensitive or not depending on context.

The LinkedIn situation is better than average: a concerned exec saw my post and shared it with the product team, who quickly replied to my public post, taking responsibility and removing the problematic sparkle question you can see in my original screenshot.

Like most teams implementing AI, the LinkedIn team would like to get it right. And like most teams implementing AI, the approach they're using to fix the offending features means that they'll never run out of problematic examples.

The LinkedIn case and the HIPAA violation case and the Google case are fundamentally similar. The problem isn't exactly with the technology. As with most technologies, the issue is with human judgment around its implementation and usage.

What do you think?

Thanks for reading!

Kieran

