The Strawberry Problem: Why AI Can Pass the Bar Exam But Can’t Count Letters

If you want to have some fun at the expense of a multi-billion-dollar artificial intelligence model, open a chat window right now and type a simple question:

“How many Rs are in the word strawberry?”

Unless you are using a brand-new model specifically updated to catch this trick, there is a very high chance the AI will look you dead in the digital eye and confidently reply: “There are 2 Rs in the word strawberry.”

It feels like the ultimate “gotcha” moment. We’ve been told this technology is poised to revolutionize medicine, automate programming, and reshape the workforce. It can pass the Uniform Bar Exam, decode ancient texts, and write complex software from scratch. Yet it gets defeated by a third-grade spelling question.

Is the AI secretly dumb? Not exactly. The “Strawberry Problem” highlights a fascinating truth about modern technology: AI does not read the way humans do.

The Illusion of Reading

When humans look at a word on a screen, we see individual letters arranged in a sequence: s-t-r-a-w-b-e-r-r-y. We can count the letters, rearrange them, or focus on a single character at will.

Because AI interacts with us through fluent, natural text, we naturally assume it sees those same letters. But it doesn’t. Large language models (LLMs) do not work with individual letters directly, because of a process called tokenization.

Before an LLM can process any text, it has to chop that text up into smaller, digestible pieces called “tokens.” A token isn’t a letter, and it isn’t always a full word. It’s usually a chunk of characters, such as a common word fragment, that the model represents as a number.

Chunks, Not Letters

Think of tokenization like a barcode scanner at a grocery store. The scanner doesn’t care about the font, the ink, or the individual lines on the package; it just reads the barcode as a single numeric identifier.

To an AI, the word strawberry isn’t a collection of ten letters. Depending on the model’s specific dictionary, it might break the word down into two tokens: straw and berry.

Human Sees:  s  t  r  a  w  b  e  r  r  y
AI Sees:     [ Token: 11452 ]  [ Token: 8341 ]

Each of those tokens is assigned a specific number in the AI’s vocabulary (the IDs shown here are illustrative, and real ones vary by model). When you ask the AI to count the “Rs” in strawberry, it isn’t looking at the letters “s-t-r-a-w-b-e-r-r-y.” It is working from the numbers 11452 and 8341.

Because it can’t directly look inside a token to count the individual characters, it has to rely on what it picked up about spelling during training, which amounts to an educated guess. That guess is often wrong: two instead of three.
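To make the chunking concrete, here is a minimal sketch of a toy tokenizer. The two-entry vocabulary, its ID numbers, and the `toy_tokenize` function are invented for illustration. Real tokenizers learn vocabularies of many thousands of chunks from data and use more involved algorithms than this simple greedy longest-match, so treat it as a picture of the idea rather than how any particular model works.

```python
# Toy vocabulary: these chunks and ID numbers are made up for illustration.
TOY_VOCAB = {"straw": 11452, "berry": 8341}


def toy_tokenize(word: str, vocab: dict) -> list:
    """Split a word into token IDs by greedily taking the longest known chunk."""
    ids = []
    pos = 0
    while pos < len(word):
        # Try the longest possible chunk first, then shrink it.
        for end in range(len(word), pos, -1):
            chunk = word[pos:end]
            if chunk in vocab:
                ids.append(vocab[chunk])
                pos = end
                break
        else:
            # No chunk in the toy vocabulary starts at this position.
            raise ValueError(f"no token covers {word[pos:]!r}")
    return ids


token_ids = toy_tokenize("strawberry", TOY_VOCAB)
print(token_ids)  # [11452, 8341]
```

Nothing in the list `[11452, 8341]` says how many times “r” appears. That information lives in the original spelling, which the model never receives as separate letters.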

Why This Matters for Your Work

The Strawberry Problem isn’t just a funny parlor trick; it explains a lot of the common, frustrating limitations you run into when using AI for everyday tasks.

Once you realize that AI sees chunks of text rather than individual letters, its weaknesses suddenly make perfect sense:

  • Why it struggles with rhymes and poetry: Because AI doesn’t hear syllables or see trailing letters, it doesn’t intuitively know which words rhyme. It has to rely on patterns in its training data, such as verse where those words were paired or text stating that “Word A rhymes with Word B.”
  • Why it messes up acronyms and anagrams: If you give an AI a phrase and ask it to make an anagram out of the letters, it will often drop, repeat, or invent letters because it’s trying to scramble token numbers, not alphabet blocks.
  • Why character counts fail: If you ask an AI to write a bio that is “exactly 280 characters long,” it will often miss the mark. Tokens vary in length, so the model can only estimate how many characters it has written.
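Since these are exactly the checks a few lines of ordinary code handle reliably, one practical option is to verify the AI’s output yourself. The sketch below uses two hypothetical helpers, `is_exact_length` and `is_anagram`, written for this example. Note that Python’s `len` counts Unicode code points, which may differ from how a given platform counts characters such as emoji.

```python
from collections import Counter


def is_exact_length(text: str, target: int) -> bool:
    """Return True if the text is exactly `target` characters long."""
    return len(text) == target


def is_anagram(candidate: str, source: str) -> bool:
    """Return True if both strings use the same letters the same number of times.

    Case, spaces, and punctuation are ignored.
    """
    def letters(s: str) -> Counter:
        return Counter(ch for ch in s.lower() if ch.isalpha())

    return letters(candidate) == letters(source)


print(is_anagram("Silent", "Listen"))         # True
print(is_anagram("strawbery", "strawberry"))  # False: one "r" is missing

draft = "x" * 279  # stand-in for a bio the AI claims is 280 characters
print(len(draft), is_exact_length(draft, 280))  # 279 False
```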

Fixing the Blind Spot

Engineers are actively trying to fix this. Newer, reasoning-focused models are trained to work through a problem step by step (an approach often called “chain of thought”) before answering, and some AI assistants can also write and run a quick piece of Python code behind the scenes to count the letters for them.
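The article does not show what code a model would actually run, but the snippet is probably no more complicated than this. The `count_letter` name is an assumption made for this sketch. Code works on the actual characters of the string, so tokenization never enters the picture.

```python
def count_letter(word: str, letter: str) -> int:
    """Count how many times a letter appears in a word, ignoring case."""
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # 3
```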

But until that’s standard everywhere, you can help the AI work around its token blindness by making it break the word down first.

The next time you need an AI to do precise character, spelling, or syllable work, try this prompt adjustment:

Instead of: “How many Rs are in strawberry?”

Try this: “Spell out the word strawberry, putting a hyphen between every single letter. Then, count how many times the letter ‘r’ appears in that list.”

By forcing the AI to output the letters one by one, you push it to write each letter as its own token, or close to it. Suddenly, it can “see” them, and it is far more likely to count all three correctly.
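If you use this trick often, a small template saves retyping, and generating the expected spelling yourself gives you something to compare the AI’s answer against. This is only a sketch: `build_spelling_prompt` and `spell_out` are hypothetical helper names, and the prompt wording simply mirrors the example above.

```python
def spell_out(word: str) -> str:
    """Return the word with a hyphen between every letter."""
    return "-".join(word)


def build_spelling_prompt(word: str, letter: str) -> str:
    """Build a prompt that asks the AI to spell the word out before counting."""
    return (
        f"Spell out the word {word}, putting a hyphen between every single "
        f"letter. Then, count how many times the letter '{letter}' appears "
        f"in that list."
    )


print(build_spelling_prompt("strawberry", "r"))

# What a correct spelled-out answer should look like, for comparison.
expected = spell_out("strawberry")
print(expected)                       # s-t-r-a-w-b-e-r-r-y
print(expected.split("-").count("r"))  # 3
```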
