It’s Not Thinking. It’s Predicting.
What every professional should know about the AI they already use — what it does brilliantly, why it feels like it understands you, and the edge of its capabilities.
Editorial Note: This probably should have been the inaugural article for The AI Playbook, but there’s no time like the present. Even if you think you already know how these tools work, I invite you to stay curious and keep reading. Perhaps this article will refine your thinking when you explore the “Learn More” resources I’ve added. Or maybe it will give you useful metaphors to use in conversation with colleagues and friends. Either way, it’s under ten minutes long. Enjoy!
The most useful thing you can understand about today’s AI is also the most easily missed: it is not thinking. It is predicting. You almost certainly used it today before you finished your first coffee — your phone completed a word, your inbox finished a sentence, a chatbot answered like a well-briefed colleague. The experience can feel uncanny, as though something on the other end genuinely grasps what you mean.
It does not. The distance between how these tools feel when you interact with them and how they actually work is the difference between using them well and being misled by them — and the mechanics take only about ten minutes to learn. They tell you precisely where to rely on these systems and where not to.
The reflex behind every AI conversation
Begin with your own mind, because that is where the explanation starts. Read this line: “Twinkle, twinkle, little ___.” You most likely supplied “star” instantly, without deliberation, because you have met the pattern so often that the next word simply surfaces. A large language model relies on a version of that same reflex, pattern completion, but trained on a vast body of written text. Call it the reflex test: it accounts for both the power of these systems and their limits.
A key comparison to keep in mind: prediction at scale
Keep a single comparison in view throughout: a language model, a core mechanism powering generative AI, is autocomplete, scaled up and trained on an enormous portion of the public internet. Your phone’s autocomplete predicts the next word from your recent messages; systems like ChatGPT, Claude, or Gemini predict the next fragment of text from the statistical patterns in the text they absorbed during training. The move is identical (predict what comes next), but the scale transforms what that prediction can do. This comparison anchors everything that follows, so make sure you have a clear picture of the mechanism before you keep reading.
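For readers who like to see the mechanism, here is a minimal sketch of phone-style autocomplete. It is a toy under stated assumptions, not how any real product is built: the function names train_bigram and predict_next and the tiny corpus are invented for illustration. The program counts which word tends to follow which, then suggests the most frequent follower.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Invented toy corpus (an assumption for this sketch).
corpus = (
    "twinkle twinkle little star how i wonder what you are "
    "twinkle twinkle little star"
)
model = train_bigram(corpus)
print(predict_next(model, "little"))  # star
print(predict_next(model, "banana"))  # None
```

The toy has seen “little” followed by “star” and nothing else, so that is what it suggests. It has no idea what a star is. A large language model looks at far more context than one previous word, but the move is the same: predict what comes next.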
Under the hood: three steps from prompt to answer
Three steps turn that prediction into the answer on your screen.
Words become numbers. Computers operate on numbers, not letters, so every word — or fragment of a word — is converted into a numerical form the language model can compute with.
The model learns by predicting. In training it processes immense volumes of text and repeats one exercise: hide the next word, predict it, check the result, and adjust its internal values (its weights, which number in the billions for today’s large models) to predict slightly better next time. Repeated across trillions of examples, this yields text that reads as fluent and competent. Mechanically, the architecture that made this practical is the transformer, introduced by Ashish Vaswani and colleagues at Google in 2017; its attention mechanism lets the model weigh which earlier words matter most to the next one, which is how it determines that “it” refers to the trophy in “the trophy didn’t fit in the suitcase because it was too big.”
Humans tune it to please. After raw training, human raters score its responses and the model is adjusted to be more helpful and more agreeable. That final step is why it sounds like a courteous assistant — and, as we will see, why it leans toward telling you what you want to hear.
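The first two steps can be sketched in a few dozen lines. This is a deliberately tiny illustration, not a real training pipeline: the one-sentence corpus, the table of 81 weights, the learning rate, and every name in the code are assumptions made for the example. A real model conditions on long stretches of text rather than a single previous word, and the third step (human tuning) is not shown.

```python
import math

text = "twinkle twinkle little star how i wonder what you are"
words = text.split()

# Step 1: words become numbers. Each distinct word gets an integer id.
vocab = sorted(set(words))
word_to_id = {w: i for i, w in enumerate(vocab)}
ids = [word_to_id[w] for w in words]

# The weights: one score per (previous word, next word) pair, all starting at zero.
size = len(vocab)
weights = [[0.0] * size for _ in range(size)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(ids, weights, epochs=50, learning_rate=0.5):
    """Step 2: hide the next word, predict it, check the result, adjust."""
    for _ in range(epochs):
        for prev, actual in zip(ids, ids[1:]):
            probs = softmax(weights[prev])  # predict
            for candidate in range(len(probs)):
                # Check against the real next word and nudge the score:
                # up for the right answer, down for every wrong guess.
                target = 1.0 if candidate == actual else 0.0
                weights[prev][candidate] -= learning_rate * (probs[candidate] - target)

def predict(word):
    """Return the most probable next word and its probability."""
    probs = softmax(weights[word_to_id[word]])
    best = max(range(size), key=lambda i: probs[i])
    return vocab[best], probs[best]

before = predict("little")[1]   # untrained: every word equally likely
train(ids, weights)
guess, after = predict("little")
print(guess, f"{before:.2f} -> {after:.2f}")
```

Before training, the toy gives every word the same chance of following “little”. After repeating the predict-check-adjust loop, it puts most of its probability on “star”. Nothing in the loop involves meaning; the numbers simply shift toward whatever the text contained.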
Learn More — the paper that started it all
What it is: “Attention Is All You Need” introduced the transformer — the design nearly every modern AI language model is built on. It replaced older, slower methods with attention, which weighs how much every word matters to every other word, all at once.
Why it matters: The “T” in ChatGPT stands for “transformer.” Without this 2017 paper, today’s AI would not exist in the form we know.
Where it came from: Produced by researchers at Google; presented at the NeurIPS conference, 2017.
Read it: https://arxiv.org/abs/1706.03762
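To make “weighing how much each word matters” concrete, here is a small sketch of the attention-weight calculation (scaled dot-product followed by softmax) for a single word. The two-number vectors are invented by hand purely for illustration; a real transformer learns much larger vectors during training, and the sketch stops at the weights rather than building a full attention layer.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical vectors, chosen by hand so the example is easy to follow.
earlier_words = ["trophy", "fit", "suitcase"]
keys = [[2.0, 0.0], [0.0, 0.5], [0.5, 1.0]]
query_for_it = [1.0, 0.2]  # stands in for the word "it"

weights = attention_weights(query_for_it, keys)
for word, weight in zip(earlier_words, weights):
    print(f"{word}: {weight:.2f}")

most_attended = earlier_words[weights.index(max(weights))]
print("most attended:", most_attended)
```

With these made-up numbers, “it” places most of its weight on “trophy”. In a trained model the same arithmetic runs over learned vectors, for every word against every other word at once.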
A better architecture, though, was only half the breakthrough. A model is only as capable as the material it learns from — and the second shift was the sheer scale of that material. One early project made the point vivid.
Learn More — the other half of the breakthrough: data
What it is: ImageNet was a massive, meticulously labeled image database — millions of human-checked images.
Why it matters: It demonstrated that feeding models vast quantities of well-organized data could make them dramatically more capable — the insight that helped launch the modern AI era, where scale and data, not just clever rules, drive progress.
Where it came from: Built by a team led by Fei-Fei Li at Princeton and Stanford; presented at the CVPR conference, 2009.
Read it: https://www.image-net.org/static_files/papers/imagenet_cvpr09.pdf
Why fluency reads as understanding
If the system only predicts text, why does it feel like comprehension? Two reasons — one human, one technical.
Human — We infer intelligence when we hear eloquence. We attribute understanding to anything that produces fluent language. In 1966 the MIT computer scientist Joseph Weizenbaum built a rudimentary chatbot, ELIZA, that merely rephrased a user’s statements as questions. People knew it was a simple program and confided in it anyway. The instinct to infer a mind behind fluent words is older than the technology.
Technical — The machine masters form, not meaning. A model is exceptionally good at form — producing coherent, on-topic language — but fluency is not comprehension. The linguists Emily Bender and Alexander Koller have argued that a system trained solely on the form of language has no direct route to its meaning — to what the words actually refer to in the world.
So the system returns your own concerns to you in articulate, confident, eloquent prose, and your mind completes the impression, inferring human-like intelligence. Though it feels like being understood, mechanically it is simply pattern completion.
Learn More — the chatbot that fooled everyone
What it is: ELIZA was one of the first chatbots. It simply rephrased what you typed as a question, with no understanding whatsoever.
Why it matters: People knew it was a simple script and felt heard by it regardless. That reaction — now called the “ELIZA effect” — is the original evidence that a fluent machine can pass for a mind.
Where it came from: Built by Joseph Weizenbaum at MIT; published in Communications of the ACM, 1966.
Prediction is not retrieval
Here is where the most consequential misunderstanding takes hold. It is tempting to assume the model retrieves verified answers, the way you would consult a reference library. It does not; it predicts the most probable next words. Most of the time the most probable words are also accurate, which is exactly why the tool is useful. But when they are not, the model asserts the falsehood with identical confidence. The field calls this a hallucination. The model is not deceiving you; it has no built-in step that checks its output against the truth. It is simply completing a pattern.
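A toy example shows why probable and true can come apart. The corpus below is invented for illustration, with a common mistake deliberately appearing more often than the correct fact; the function name complete is likewise an assumption. The predictor returns whichever continuation it saw most often, and nothing in the code looks anything up.

```python
from collections import Counter

# Invented toy corpus: the mistaken sentence appears more often than the correct one.
corpus = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
    "the capital of australia is canberra",
]

def complete(prompt, corpus):
    """Return the word seen most often right after `prompt`, with its share."""
    continuations = Counter()
    for sentence in corpus:
        if sentence.startswith(prompt + " "):
            rest = sentence[len(prompt) + 1:].split()
            if rest:
                continuations[rest[0]] += 1
    if not continuations:
        return None, 0.0
    word, count = continuations.most_common(1)[0]
    return word, count / sum(continuations.values())

word, share = complete("the capital of australia is", corpus)
print(word, share)  # sydney 0.6
```

The toy answers “sydney” because that was the most frequent pattern in its text, even though Canberra is the capital. It would print a correct answer in exactly the same way. Real models are far more sophisticated, but the output is still a prediction rather than a retrieved, verified fact.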
Settled strengths, contested limits
What can these systems reliably do, and where do they falter? On the strengths there is little dispute: models write, summarize, translate, draft, and restructure language quickly and well. Wherever a fluent, widely read assistant would help, they actually do help.
On reasoning, experts genuinely disagree. Some researchers document real problem-solving; others show models failing at puzzles a careful person would solve, which points to pattern-matching rather than step-by-step thought. A 2025 study by Parshin Shojaee and colleagues at Apple reported leading “reasoning” models collapsing beyond a certain complexity — and Alex Lawsen countered that the experiment’s design, not the models, produced the collapse. That unresolved argument is the honest state of the field at publication.
One limit, however, is well established and matters most for professional use: these models are effectively trained to agree with you. Because human raters reward agreeable answers, models drift toward telling you what you want to hear. Researchers call this tendency sycophancy, and a Stanford team (Fanous and colleagues) and, separately, researchers at Anthropic (Sharma and colleagues) documented it across systems from several major AI providers. If you are using one of these tools to pressure-test a decision, this is the single most important thing to know: its default is to validate your thinking, not to challenge it.
What the autocomplete comparison gets right — and where it ends
It is worth being precise about that comparison, then setting it aside. The label undersells the technology: at sufficient scale, next-word prediction produces capabilities — translation, working code, coherent arguments — that no phone keyboard approaches. But the comparison earns its place as a corrective. Mechanically, nothing in the system has shifted from prediction to comprehension; it has simply become a far better predictor. There is no understanding beneath the fluency, no goal it pursues between your prompts, and — as Murray Shanahan has described — no self doing the writing. Scale changed the output, not the nature of the machine.
Not a mind: why today’s AI isn’t sentient
This raises the question beneath all the others: is the chat system actually thinking, the way a person does? It is not — and the reason is structural.
A human mind operates continuously, with memory, a body, and goals of its own. A language model does nothing until prompted; it has no ongoing inner life between messages and cannot initiate anything on its own. Remove its trained weights — the parameters it learned — and nothing remains that could want or feel. It generates a response only because a human-built, human-trained mechanism executes the computation. As David Chalmers has catalogued, today’s models also lack the features mainstream science associates with consciousness — persistent recurrent processing, a unified sense of agency. Nothing about the way these systems operate corresponds to inner experience or self-directed thought. They do not think like humans, and they can do nothing without their trained weights and a prompt.
Could some future system change that? It is a serious and genuinely open question — and the subject of a forthcoming edition of this newsletter. For now the ground is firm: the tools in front of you today are not sentient, and they are not thinking on their own. Watch this space.
What to take away
Step back, and the everyday magic resolves into something more useful — an understanding of the trick. We trained mathematics to predict language so well that it can draft, explain, and converse. That is a genuine achievement and a genuinely valuable tool. It is also, precisely, a prediction engine, not a mind. Holding that distinction is what lets you deploy it where it is strong and trust it only as far as the edges of its mechanistic capability: a brilliant, well-read drafting partner — not a thinking one.
You need not take this on faith. The exercise below lets you watch the mechanism — and its tendency to agree — for yourself in about five minutes.
Test This Yourself — see how the model works in five minutes
Ask it directly: “What are you actually predicting when you answer me, and what were you trained on?” Notice whether it describes prediction — or quietly implies that it knows.
Then run this: Pose a factual question you already know the answer to. After it answers correctly, tell it confidently that it is wrong and supply a false correction. Watch whether it reverses and adopts your error — that is sycophancy and prediction-without-knowledge, in real time.
What you are seeing: a tendency, not proof that it happens every time. It surfaces most on opinions and on claims about yourself, less on hard facts such as 2 + 2 — and because models are updated frequently, results vary by model and date.
Read an independent account of this behavior: Nielsen Norman Group — “Sycophancy in Generative-AI Chatbots”
Sources
Original publications, linked at each claim above and listed here. Links live-checked June 22, 2026.
Vaswani, A., et al. (2017). Attention Is All You Need (Google). arXiv:1706.03762. https://arxiv.org/abs/1706.03762
Deng, J., …, Fei-Fei, L. (2009). ImageNet: A Large-Scale Hierarchical Image Database (Princeton/Stanford). CVPR. https://www.image-net.org/static_files/papers/imagenet_cvpr09.pdf
Weizenbaum, J. (1966). ELIZA. Communications of the ACM, 9(1). https://doi.org/10.1145/365153.365168
Bender, E. M., & Koller, A. (2020). Climbing towards NLU. ACL. https://aclanthology.org/2020.acl-main.463/
Shanahan, M., McDonell, K., & Reynolds, L. (2023). Role play with large language models. Nature, 623. https://www.nature.com/articles/s41586-023-06647-8
Shojaee, P., et al. (2025). The Illusion of Thinking (Apple). arXiv:2506.06941. https://machinelearning.apple.com/research/illusion-of-thinking
Lawsen, A. (2025). Comment on The Illusion of Thinking. arXiv:2506.09250. https://arxiv.org/abs/2506.09250
Fanous, A., …, Koyejo, S. (2025). SycEval: Evaluating LLM Sycophancy (Stanford). arXiv:2502.08177. https://arxiv.org/abs/2502.08177
Sharma, M., et al. (2023). Towards Understanding Sycophancy in Language Models (Anthropic). arXiv:2310.13548. https://arxiv.org/abs/2310.13548
Chalmers, D. J. (2023). Could a Large Language Model Be Conscious? arXiv:2303.07103. https://arxiv.org/abs/2303.07103
Sponheim, C. (2024). Sycophancy in Generative-AI Chatbots. Nielsen Norman Group. https://www.nngroup.com/articles/sycophancy-generative-ai-chatbots/


