The Hard Truth About AI and Engineering
LLMs are great at patterns, terrible at cause and effect. If you're leading teams and betting otherwise, you're driving blind.

Before I get into the critique, let's give credit where it's due. AI has advanced by leaps and bounds in engineering. Just five years ago, the idea that a machine could generate a working React component, optimize a SQL query, or even scaffold a backend service in seconds would have felt like science fiction. Today, that's normal. Tools like Copilot, ChatGPT, and others have massively boosted developer productivity, especially for repetitive or boilerplate tasks. For a junior engineer, AI is like having a senior whispering syntax reminders in real time. That's progress we can't deny.
Yet here's the hard truth: AI is still not a software engineer.
I keep meeting experienced engineers and even CTOs who are under the impression that LLMs (Large Language Models) have actually learned software engineering. They think AI now "understands" code the way a junior developer does after months of debugging, fixing edge cases, and breaking things in staging. That's just not true.
What LLMs are doing is nothing close to how humans learn. They are probability engines. At their core, these models are token predictors - their entire "skill" is to guess what word, symbol, or line of code should come next based on patterns they've seen in their training data. There's no comprehension, no intent, no mental model of how systems behave in the real world.
LLMs Explained the Way I’d Tell a Friend
Think of an LLM like a parrot that has read the entire internet, millions of books, GitHub repos, Wikipedia articles, and more. It doesn't "know" what any of that means. What it has learned is: when people write certain words, these other words usually follow.
Technically, here's the process boiled down:
- The internet is chopped into tiny pieces of text called tokens. A token might be a word, part of a word, or even punctuation.
- The model is trained to guess the next token in a sequence, over and over, billions of times.
- Through this repetition, the model builds statistical weights - essentially "probabilities" - that help it predict what's likely to come next.
That's it. At the end of the day, all it knows is: if you give me a sentence starting with "Once upon a ___", the next likely word is "time." Not because it understands fairy tales, but because it saw that pattern a billion times in training.
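To make that concrete, here is a deliberately tiny sketch in Python. It is not how a real transformer works (those learn weights over subword tokens, not whole words), but it shows the core idea in miniature: count which token follows which, then predict the most common one.

```python
from collections import Counter, defaultdict

# A tiny stand-in corpus for "the entire internet".
corpus = "once upon a time there was a fox . once upon a time there was a king ."
tokens = corpus.split()  # real models use subword tokens, not whole words

# Count how often each token follows each other token: the "statistical weights".
follows = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def predict_next(token: str) -> str:
    """Return the token that most often followed `token` in training."""
    return follows[token].most_common(1)[0][0]

print(predict_next("a"))  # -> "time", because "a time" appeared most often
```

Scale that idea up to trillions of tokens and billions of learned weights and you have, conceptually, what an LLM does. What you don't get, at any scale, is a model of what the words refer to.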
When you ask it for code, it does the same thing. It has seen enough Python, Java, and TypeScript online that it can guess what looks like valid code. But guessing what "looks like" code is not the same as understanding how that code will run in your application.
This is why they're called language models. Their training objective is purely about language patterns - not reasoning, not logic, not cause and effect. Calling them "AI" makes them sound magical, but at their heart, they are still just very powerful language models trained on text.
Why AI’s ‘Understanding’ Is Just a Performance
It looks convincing, I get it. You ask ChatGPT to write a Python script for connecting to an API, and it spits out something that looks spot on. You ask Copilot for a React component, and it auto-completes like magic. It feels like the AI knows what it's doing. But it doesn't.

LLMs don't know what "code" is. They don't know what "APIs" are. They don't know why engineering exists. All they've learned is the statistical likelihood that certain patterns of symbols come after others. That's why they can generate boilerplate or fill in repetitive tasks really well. But the minute you need reasoning - real cause and effect reasoning - the cracks show.
Here's a simple example. Ask ChatGPT to generate code for a distributed lock using Redis. The code will compile. It'll even look "okay." But most of the time, it will miss subtle but critical edge cases around expiry, race conditions, or failover. An engineer who's been burned in production by locks failing would never make those mistakes. AI will, because it doesn't know what failure is. It doesn't know production is a thing.
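For contrast, here is roughly the shape of the lock an engineer who has been burned would reach for. This is a sketch using the redis-py client, with illustrative key names and TTLs, not a production-ready library: it sets the lock atomically with an expiry and only releases it if it still owns it.

```python
import uuid

import redis

r = redis.Redis()  # assumes a local Redis instance; adjust host/port as needed

# Release only if the lock still holds our token. Done in a Lua script so the
# check and the delete are atomic: we never delete a lock that expired and was
# re-acquired by someone else.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def acquire_lock(name: str, ttl_ms: int = 10_000) -> str | None:
    """Try to take the lock. The TTL guards against a crashed holder."""
    token = uuid.uuid4().hex
    # NX: only set if the key doesn't already exist; PX: expire after ttl_ms.
    if r.set(name, token, nx=True, px=ttl_ms):
        return token
    return None

def release_lock(name: str, token: str) -> bool:
    """Release only if we still own the lock."""
    return bool(r.eval(RELEASE_SCRIPT, 1, name, token))
```

Even this sketch punts on the harder parts, like what happens when Redis fails over to a replica that never saw the lock. That is exactly the kind of edge case that gets skipped when code is produced by pattern-matching rather than by someone who has lived through the failure.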
“It’s a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; it’s just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something.”
Andrej Karpathy
That’s the point: LLMs predict the next token. Useful, but not the same as understanding.
The World Model Gap
Humans build mental models. You know that saving to a database might fail if the disk is full. You know a retry loop without backoff can crash an entire service. You don't learn that by reading text. You learn it by living through outages, debugging, and experience.
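That scar tissue shows up in small decisions. Here is a minimal sketch (the function name and limits are made up for illustration) of the retry an experienced engineer writes, with exponential backoff and jitter, precisely because they have seen what a tight retry loop does to a struggling dependency.

```python
import random
import time

def call_with_retry(func, max_attempts: int = 5, base_delay: float = 0.5):
    """Retry a flaky call with exponential backoff plus jitter.

    Without the sleep, a tight retry loop turns one failing dependency
    into a self-inflicted denial of service.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            # Exponential backoff: 0.5s, 1s, 2s, ... plus random jitter so
            # thousands of clients don't all retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            time.sleep(delay)
```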
LLMs have no world model. They don't know what "disk" means. They don't know what "failure" means. They only know that certain words show up together often enough. This is the fundamental limitation today.
And it's not just code. Even in plain English, LLMs don't "understand" language. To the model, words are just vectors in a high-dimensional space. When we say "the server went down," it doesn't know that means your users are screaming on Twitter, your pager is buzzing, and revenue is bleeding. It just knows "server" and "down" often appear near each other in text.
So when we say things like "AI understands language" - it doesn't. It's doing math. High-dimensional probability math, yes, but still just math.
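If "just math" sounds abstract, here is a toy sketch with hand-picked three-dimensional vectors (real embeddings are learned, not hand-written, and have hundreds or thousands of dimensions). "Similarity" between words is literally an angle between vectors, nothing more.

```python
import math

# Toy, hand-picked vectors standing in for learned embeddings.
vectors = {
    "server": [0.9, 0.1, 0.3],
    "down":   [0.8, 0.2, 0.1],
    "banana": [0.1, 0.9, 0.7],
}

def cosine(a, b):
    """Cosine similarity: the angle between two vectors, pure arithmetic."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine(vectors["server"], vectors["down"]))    # high: vectors chosen to be close
print(cosine(vectors["server"], vectors["banana"]))  # lower: vectors chosen to be far apart
```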
Why Tech Leaders Are Getting It Wrong
The scary part is how many tech leaders are betting their future on the belief that AI "knows." I've sat in conversations where someone says, "We don't need to hire as many engineers now, Copilot will handle the heavy lifting." That's like saying Google Translate eliminated the need for diplomats.
If you're a CTO and you don't understand what an LLM really is, you're effectively driving your team blindfolded. Sure, the car moves fast at first. But eventually you'll hit a wall.
One story that keeps popping up in forums: a junior engineer copies code from ChatGPT into production without reviews, and things go sideways. Here’s an example on Hacker News where someone pasted AI-generated SQL migrations that quietly corrupted data. The AI didn’t “know” the schema. It just stitched together plausible queries. Looks legit, until you’re restoring backups at 2 AM.
Another thread worth reading: this GitHub issue where developers discuss Copilot hallucinating functions that don’t exist in the codebase. It feels helpful until you realize the AI invented methods out of thin air. Humans would never do that.
And if you want a real-world cautionary tale at scale, look at Builder.ai’s insolvency. A company that marketed itself as “AI-powered software building” is now bankrupt, leaving customers stranded. The hype ran faster than the reality, and reality eventually won.
What AI Can Do Well
Now don't get me wrong, I'm not dismissing AI. It's incredibly powerful as a multiplier. Think of it as the smartest autocomplete system you'll ever have. It can draft boilerplate, suggest patterns, and remind you of syntax you forgot. For things like writing regex, JSON schemas, or generating API client wrappers - it's a blessing.
But that's not engineering. Engineering is debugging unknown failures. It's designing resilient systems under constraints. It's balancing tradeoffs between speed, cost, and maintainability. It's managing the human side of teams, code reviews, and shipping reliable software. AI today has zero experience in those dimensions.
The Illusion of Progress
There's also this narrative that AI keeps improving and will "soon" cross into real reasoning. I'm not convinced. Fine-tuning, RAG (retrieval-augmented generation), bigger models - they make the outputs look cleaner, more plausible. But under the hood it's still token prediction. There's no hidden emergence of intuition.
We keep projecting human qualities onto machines. We see a polished paragraph and assume intelligence. We see a bug fix suggestion and assume experience. But there's no magician behind the curtain. It's still just algebra.
The irony is - if you really want to use AI well - you need more real engineers, not fewer. People who understand when the AI is hallucinating, when it's cutting corners, when it's introducing fragility. Without that, you're building castles on sand.
Where This Leaves Us
So where does that leave tech leaders? To me the answer is simple: treat AI like a tool, not a teammate. It's closer to StackOverflow autocomplete than a senior engineer. Useful, but dangerous if you rely on it blindly.
I’ve written before about how over-reliance on AI can backfire, especially for juniors in Why I Had to Ask a Junior Dev to Stop Using AI. On the flip side, I’ve also shared how I actively promote AI when used responsibly in Why We Made Copilot Mandatory. Both show the balance that’s needed, and are worth reading alongside this piece.
Hire engineers. Train them. Let them use AI to speed up repetitive tasks. But don't expect AI to reason about complex systems. Not today. Maybe not for a very long time.
AI is going to be positive for engineering. It will automate the boring stuff, act as a tireless assistant, and lower the barrier for newcomers. It might even make engineering more fun again, because we'll spend less time on drudgery and more time on real problem solving. But let's not confuse that with replacement.