The Future of Software Engineering: What I Know and How I Know It

25 May 2026, 00:00

ai / software-engineering / future / agents / career / management / leadership

A recent ThoughtWorks retreat of senior engineering practitioners published its findings on where software development is heading. Reading the report felt like someone had been reading my mind, and the minds of the people I follow online, and had synthesized it all into one document. When a lot of people arrive at the same conclusions independently, that’s a signal worth paying attention to. Here’s my take on each major theme.

The engineer who no longer writes every line — but whose judgment is the thing that makes the whole system work.

1. Where Does the Rigor Go?

The most important question in software engineering right now isn’t “can AI write code?” It can. The question is: when AI writes the code, where does engineering discipline go?

The answer is that it doesn’t disappear — it migrates. Specifically, it moves upstream and into the design and infrastructure around the code.

The mechanical act of translating requirements into syntax is increasingly handled by AI. A skilled engineer can generate more than 90% of production-ready code through prompting. Ben Congdon’s drafter analogy is apt: CAD didn’t eliminate draftspeople, it shifted the work upstream toward design judgment and away from manual execution. That’s what’s happening to software engineering now. What remains valuable: understanding code at a gears level, foundational CS knowledge, and software taste — knowing when a solution is elegant versus merely functional. These become more important when the typing is automated. What disappears is the economics of being purely a coder — there’s no longer a viable market for software craftsmanship at scale, and I wrote about where that leaves most engineers in Coding is Solved (And That’s Fine).

The rigor migrates to four specific places:

Into specifications. Drew Breunig built a software library with no code at all — just specs and tests. The AI generates the implementation in whatever language you want. If the spec is wrong, the code is wrong. If the spec is right, the code is right. The spec is now the highest-leverage artifact in the system.

Into test suites. TDD produces dramatically better results with AI coding agents than any other practice I’ve found. Write your tests first, let the AI generate the implementation, and you have an immediate verification system for non-deterministic output. The tests become your spec. The generated code is expendable: if it passes the tests, it’s acceptable. More importantly, TDD prevents a specific failure mode: agents writing tests that verify their own broken behavior. When the tests exist before the code, and the agent isn’t allowed to rewrite them, that cheat is off the table. This is the core argument in Embrace AI Pairing, which I wrote before most people were taking AI coding seriously. The ThoughtWorks retreat heard the same thing from practitioners at major tech companies: “I’ve gotten better results from TDD and agent coding than I’ve ever gotten anywhere else, because it stops a particular mental error where the agent writes a test that verifies the broken behavior.” Multiple independent data points landing on TDD as the key discipline for agent-generated code is not a coincidence.
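
Here is a minimal Go sketch of that tests-first flow. Everything in it is hypothetical: `Slugify` stands in for whatever the agent generates, and `slugSpec` is the set of expectations a human writes before any implementation exists. In a real project the cases would live in a `_test.go` file run by `go test`; they are inlined here only to keep the example to one file.

```go
package main

import (
	"fmt"
	"strings"
)

// specCase is one human-written expectation, fixed before any implementation exists.
type specCase struct {
	in   string
	want string
}

// slugSpec is the (hypothetical) spec for a Slugify function.
var slugSpec = []specCase{
	{"Hello World", "hello-world"},
	{"  Trim me  ", "trim-me"},
	{"Already-slugged", "already-slugged"},
}

// Slugify stands in for agent-generated code. It is expendable:
// any replacement that passes slugSpec is equally acceptable.
func Slugify(s string) string {
	return strings.Join(strings.Fields(strings.ToLower(s)), "-")
}

// Verify runs a candidate implementation against the spec and
// returns one message per failing case.
func Verify(impl func(string) string, spec []specCase) []string {
	var failures []string
	for _, c := range spec {
		if got := impl(c.in); got != c.want {
			failures = append(failures, fmt.Sprintf("input %q: got %q, want %q", c.in, got, c.want))
		}
	}
	return failures
}

func main() {
	fmt.Println(len(Verify(Slugify, slugSpec)), "failures")
}
```

The point of the shape is that the agent only ever touches `Slugify`. The spec table is the part a human owns and reviews.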

Into feedback infrastructure. Banay’s argument about back pressure nails this: type checkers, linters, automated test runners, browser automation. Without that infrastructure, you become the bottleneck, spending your judgment on trivial corrections the agent could have caught itself. The value of a good type system just went up dramatically. It doesn’t just catch your mistakes; it catches the agent’s. Which raises a pointed question: why are we writing AI-generated code in dynamically typed languages at all? Research on LLM code generation shows that 94% of the compilation errors in LLM-generated code are type-check failures, exactly the class of errors a static type system catches at compile time. Python is the language LLMs know best and generate most fluently, but Python’s dynamic typing means those errors reach runtime instead of being caught early. As I explored in Programming Languages in 2026, this is reshaping how I think about language choice: TypeScript has overtaken Python as the most popular language on GitHub, and a big part of why is that it catches the class of mistakes agents make most often. My own preference is Go: statically typed, brutally simple syntax, single-binary deployment, and LLMs generate it cleanly. As I wrote in Why Go for Robotics?, LLMs will happily generate Go all day, but ask them for complex C++ and they start hallucinating memory leaks and template metaprogramming nightmares. The simplicity that makes Go easy for humans to read and review turns out to make it easy for agents to generate correctly too.
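
To make the type-system point concrete, here is a small Go sketch. The names `UserID`, `OrderID`, and `CancelOrder` are made up for illustration. Both IDs are integers underneath, which is exactly the kind of mix-up an agent (or a tired human) produces, and the compiler rejects it before anything runs.

```go
package main

import "fmt"

// Distinct named types: both are int64 underneath, but the compiler
// treats them as incompatible.
type UserID int64
type OrderID int64

// CancelOrder only accepts an OrderID.
func CancelOrder(id OrderID) string {
	return fmt.Sprintf("order %d cancelled", id)
}

func main() {
	user := UserID(42)
	order := OrderID(1001)
	_ = user

	// Uncommenting the next line fails at compile time, because a
	// UserID cannot be used where an OrderID is required:
	// CancelOrder(user)

	fmt.Println(CancelOrder(order))
}
```

In a dynamically typed language the same mistake would run and cancel the wrong record. Here the agent gets the error back immediately, with no human in the loop.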

Into architectural judgment. When code is cheap, the scarce resource is knowing what to build and how the pieces fit together. The most precious asset is no longer the code. It’s the judgment about what to build — and that judgment can’t be prompted. In SaaS is Dead, Long Live Platforms, I argued this is reshaping the entire software business model, not just how individual engineers work.

“Engineering quality doesn’t disappear when AI writes code. It migrates to specs, tests, constraints, and risk management.” — ThoughtWorks Retreat Report, p.3

That quote came from a room full of senior engineering practitioners from major technology companies — people who arrived at this conclusion independently, through their own experience building and running large systems. This isn’t a fringe take. It’s the emerging consensus of the people who are actually doing this work.

2. The Middle Loop: A New Kind of Engineering Work

Software development has always had two loops. The inner loop is the personal cycle of writing, testing, and debugging. The outer loop is CI/CD, deployment, and operations.

There’s a third loop forming between them. The ThoughtWorks retreat report actually named it — calling it the “middle loop” and flagging it as their “strongest first-mover concept,” noting that “nobody in the industry has named this yet.” The fact that senior practitioners at major tech companies are independently identifying the same gap is exactly the kind of convergence that tells you something real is happening.

The engineering work that matters now is directing agents, decomposing problems into agent-sized packages, evaluating output quality, and maintaining architectural coherence across many parallel streams of AI-generated work. This isn’t coding. It isn’t traditional management. It’s something new — supervisory engineering. The managers who turned themselves into human middleware — receive requirements, decompose into tickets, run standup, report status — those roles are going away. AI is very good at being middleware, and I made that case as bluntly as I could in The End of the Glorified Babysitter.

What AI cannot replace is the engineering leader who understands the system deeply enough to know when an agent is confidently wrong, who can identify the impediments that slow both human and agent work, and who can make judgment calls in the absence of complete information.

What this looks like in practice: I sat down with Claude Code on a Saturday morning with a complex, years-old Node.js codebase that had been puzzling my team for weeks. Four hours later I had a complete architectural analysis — component relationships, data flow patterns, the places where the design made sense and the places where it was going to break us. Four hours versus four weeks. Not because I’m especially good with AI tools, but because I’m experienced enough to ask the right questions and recognize when the answers are wrong. I wrote up the full story in Get Stuff Done. That’s the middle loop: experience amplified, judgment applied at speed.

The bottleneck is no longer the model — it’s the workflow. Engineers who invest in how they use these tools have a compounding advantage over those who treat the model as a magic box you talk to, and Tooling Is the New Model is where I worked through what that investment actually looks like.

“The practitioners who are excelling at this new work tend to share certain characteristics: they think in terms of delegation and orchestration rather than direct implementation. They have strong mental models of system architecture. They can rapidly assess output quality without reading every line.” — ThoughtWorks Retreat Report, p.5

Those aren’t skills most engineering career ladders explicitly develop or recognize. They’re skills that experienced engineers often have but rarely name. The retreat’s finding is that these — not coding speed — are what separates the engineers who will thrive in the next five years from those who won’t.

3. Agent Topologies: Conway’s Law Didn’t Retire

Conway’s Law says systems mirror the communication structures of the organizations that build them. It now applies to agents too, and this gets complicated fast.

When agents can clear backlogs in hours, the bottleneck shifts from engineering capacity to organizational dependencies. The ThoughtWorks retreat put it bluntly: “You give a team AI tools, they clear their backlog in days and then hit a wall of cross-team dependencies, architecture reviews and human-speed decision-making. The result is not faster delivery. It is the same speed with more frustration.” Practitioners at multiple major tech companies are hitting the same wall of reviews and governance processes. The problem isn’t the tools. It’s that the bottleneck moved, and the organizational structures built for human-speed work haven’t been redesigned around it.

This is already playing out in open source. The old unwritten contract — file a bug, wait for a maintainer — is collapsing. Fork it, prompt it, fix it, ship it. The friction that kept open source coherent is gone. The downstream consequence is fragmentation at scale, and governance structures built for human-speed contribution are going to break. I wrote about the mechanics of how this unfolds in Agentic Coding is About to Fracture Open Source.

Agents are ephemeral robots — no motors or sensors, but wired to the real world through APIs and tools. The MCP (Model Context Protocol) and A2A (Agent-to-Agent) specifications are the plumbing that makes agent topology possible — agents that can discover capabilities, delegate work, and compose with other agents. The enterprise architecture layer that needs to be built on top of this plumbing — with identity controls, permission boundaries, a work ledger, and governance paths — barely exists yet. Organizations with strong, well-designed APIs are dramatically better positioned than those without. Everyone else is building on sand.
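
A rough Go sketch of what the missing layer might look like at its smallest: an identity, a permission boundary, and a work ledger. This is not the MCP or A2A API. `Agent`, `Gateway`, `Invoke`, and the tool names are all hypothetical, and a real system would need authentication, durable storage, and much more.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDenied is returned when an agent calls a tool outside its permission boundary.
var ErrDenied = errors.New("tool not permitted for this agent")

// Agent is a hypothetical identity record: who the agent is and which tools it may call.
type Agent struct {
	ID      string
	Allowed map[string]bool
}

// LedgerEntry records one attempted tool call, permitted or not.
type LedgerEntry struct {
	AgentID string
	Tool    string
	Allowed bool
}

// Gateway sits between agents and tools and keeps the work ledger.
type Gateway struct {
	Ledger []LedgerEntry
}

// Invoke checks the permission boundary, logs the attempt, and runs
// the tool only if the agent is allowed to use it.
func (g *Gateway) Invoke(a Agent, tool string, run func() string) (string, error) {
	ok := a.Allowed[tool]
	g.Ledger = append(g.Ledger, LedgerEntry{AgentID: a.ID, Tool: tool, Allowed: ok})
	if !ok {
		return "", ErrDenied
	}
	return run(), nil
}

func main() {
	gw := &Gateway{}
	triage := Agent{ID: "triage-bot", Allowed: map[string]bool{"tickets.read": true}}

	out, err := gw.Invoke(triage, "tickets.read", func() string { return "3 open tickets" })
	fmt.Println(out, err)

	_, err = gw.Invoke(triage, "email.send", func() string { return "sent" })
	fmt.Println(err, "ledger entries:", len(gw.Ledger))
}
```

The design choice worth noticing is that denied calls are logged too. A ledger that only records successes can’t tell you what an agent tried to do.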

“We optimized the software delivery process for humans. Now that it’s not just humans, we have to ask what organizing actually means.” — ThoughtWorks Retreat Report, p.7

That question is sitting unanswered inside most companies right now. The engineers who understand it are already thinking about it. Most of the rest of the business isn’t — yet.

4. The Latent Knowledge Problem

Senior engineers carry decades of pattern-matching that never gets documented. They know that a specific error code is a symptom of a deeper infrastructure issue. They know that high CPU on a particular service means checking the database connection pool before anything else. This knowledge lives in people’s heads. It transfers through mentorship, pairing, and incident response — not through documentation.

This is the main prerequisite for self-healing systems, and it’s the one furthest from being solved. The ThoughtWorks retreat called this the “latent knowledge problem” and concluded that before agents can respond to incidents autonomously, organizations need to build an “agent subconscious” — a knowledge graph built from post-mortems, incident data, and operational runbooks that gives agents the historical context that experienced engineers carry in their heads. This was one of the few areas where retreat participants reached clear consensus: most organizations don’t have the culture or processes to capture this knowledge, and self-healing systems will stall here first.

As I wrote about Admiral Rickover and the Nuclear Power Program, critical operational knowledge doesn’t transfer through manuals. It transfers through years of supervised operation, qualification boards, and deliberate knowledge extraction. Rickover understood that the gap between written procedure and operational reality is exactly where accidents happen. Closing that gap for AI agents is going to require the same discipline, and most software organizations have neither the culture nor the processes to do it.

Self-healing systems will stall here. Not because the AI isn’t capable — because the prerequisite knowledge infrastructure doesn’t exist. Code changes should be the last resort in incident remediation. The path runs through better rollback, better feature flags, and better observability first.

“Senior engineers bring decades of pattern-matching to incident response… This knowledge is almost never documented. It lives in people’s heads and gets applied through experience.” The retreat’s bottom line on self-healing: “The ambition is real. The prerequisites are far from met.” — ThoughtWorks Retreat Report, p.8

The organizations that start building the knowledge infrastructure now — automated post-mortems, structured runbooks, incident knowledge graphs — will have a meaningful head start when the self-healing capability matures. Most haven’t started.
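
As a sketch of the smallest possible version of an incident knowledge graph, the Go below links symptoms to the root causes that post-mortems confirmed, and ranks causes by how often they turned out to be the answer. `IncidentGraph` and its methods are hypothetical names, and the sample incidents are invented. A real system would ingest structured post-mortems rather than hand-entered strings.

```go
package main

import (
	"fmt"
	"sort"
)

// IncidentGraph links observed symptoms to root causes confirmed in post-mortems.
type IncidentGraph struct {
	edges map[string]map[string]int // symptom -> cause -> times confirmed
}

// NewIncidentGraph returns an empty graph.
func NewIncidentGraph() *IncidentGraph {
	return &IncidentGraph{edges: make(map[string]map[string]int)}
}

// Record adds one post-mortem finding to the graph.
func (g *IncidentGraph) Record(symptom, cause string) {
	if g.edges[symptom] == nil {
		g.edges[symptom] = make(map[string]int)
	}
	g.edges[symptom][cause]++
}

// LikelyCauses returns the known causes for a symptom, most frequently
// confirmed first. Ties are broken alphabetically so the order is stable.
func (g *IncidentGraph) LikelyCauses(symptom string) []string {
	counts := g.edges[symptom]
	causes := make([]string, 0, len(counts))
	for c := range counts {
		causes = append(causes, c)
	}
	sort.Slice(causes, func(i, j int) bool {
		if counts[causes[i]] != counts[causes[j]] {
			return counts[causes[i]] > counts[causes[j]]
		}
		return causes[i] < causes[j]
	})
	return causes
}

func main() {
	g := NewIncidentGraph()
	g.Record("high CPU on orders-api", "database connection pool exhausted")
	g.Record("high CPU on orders-api", "database connection pool exhausted")
	g.Record("high CPU on orders-api", "runaway retry loop")

	// An agent responding to an incident would check these in order.
	fmt.Println(g.LikelyCauses("high CPU on orders-api"))
}
```

This is the “check the connection pool first” instinct written down where an agent can read it.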

5. The Human Side: Who Wins, Who Doesn’t, and Why

The labor market data is already telling the story. Stanford’s Digital Economy Lab found that employment for software developers aged 22-25 declined nearly 20% from its peak. Entry-level hiring at the largest tech firms fell 25%. Meanwhile, Satya Nadella says 20 to 30% of the code in Microsoft’s repositories is written by AI, Sundar Pichai says Google is over 30%, and Meta wants AI handling half its development within 12 months.

In The Three Tiers of Software Development, I laid out where the industry is stratifying:

AI Creators — the researchers and infrastructure engineers building the systems. Small population, extraordinarily compensated, irreplaceable.

AI-Enabled Developers — experienced engineers using AI as a force multiplier. Holding compensation, doing 5-10x more with the same headcount. The catch: companies won’t need as many of them. The Salesforce hiring pause after 30% AI productivity gains is a preview of what’s coming everywhere.

Software Technicians — people generating working code without deep understanding. Facing brutal compensation compression. If a non-programmer can build a functional app in a weekend, the economic justification for $130K to do the same thing doesn’t hold.

The nuance the data misses: junior developers are more valuable than the layoff numbers suggest. AI tools get them past the awkward initial net-negative phase faster. They’re a call option on future productivity. And they’re often better at AI tools than senior engineers, having never developed the habits and assumptions that slow adoption. The ThoughtWorks retreat reached the same conclusion independently, finding that juniors are “more profitable than they have ever been” while the real concern is mid-level engineers who came up during the hiring boom without developing the foundational depth to thrive in the new environment. That’s a striking level of agreement across very different observers.

The historical pattern is clear. In What Would Grace Say?, I traced how Grace Hopper built the first compiler in 1952 and the computing establishment said computers could only do arithmetic — she had working software and nobody would touch it. Every objection turned out to be wrong. The compiler didn’t replace programmers; it made them wildly more productive and freed them to think about problems instead of opcodes. In 1992, engineers hand-coded polygon rendering algorithms; two years later that work was in hardware and the job became animation and lighting. Each time the abstraction layer rose, the engineers who moved up the stack built the next industry. The ones who insisted they were hired to do the low-level work were left behind.

Garry Tan put it simply: “Intelligence is on tap now so agency is even more important.” When intelligence becomes a utility, the ability to act on it becomes everything. The most important professional skill right now isn’t technical — it’s agency: bias toward action, ownership of outcomes, the ability to unblock yourself when information is incomplete. I wrote about what that looks like as a working engineer in AI-Boosted Building — It’s All About Agency.

“AI is not replacing people. It is rearranging what people do and how they feel about doing it… Developer productivity and developer experience are decoupling. Organizations can achieve productivity gains through AI tools even in environments where developers report lower satisfaction, more cognitive load and reduced sense of flow.” — ThoughtWorks Retreat Report, p.9

That last point is the hard one that nobody wants to say out loud. The business case for investing in developer experience is weakening just as the human cost of ignoring it is rising. That tension hasn’t been resolved anywhere yet — and the organizations that figure it out first will have a real talent advantage.

6. The Technical Infrastructure Still Needs to Be Built

Every programming language in existence was designed with humans as the primary user. Dynamic typing reduces cognitive overhead for human programmers. Strong static typing catches human errors. Neither was designed with AI generation in mind.

Languages that make incorrect code unrepresentable (through strong types, restricted computation models, and formal constraints) help agents produce correct output and help humans verify it. In an agent-generation world, the old tradeoff between convenience for the human writer and strictness for the checker dissolves. The ThoughtWorks retreat converged on this exact principle, with practitioners arguing that “what is good for AI is good for humans” when it comes to language design, a conclusion that matches what I’ve been arguing on language choice. LLM code generation quality is now a first-class selection criterion when choosing what language to build in, which is why I added it as one of the five evaluation factors in Programming Languages in 2026.

The more radical possibility: source code as we know it could become a transient artifact, generated on demand and never stored. The skeptical view is that deterministic validation requires a stable artifact to test against, and that artifact is source code regardless of what you call it. But the direction of travel is clear — the code matters less than the spec and the tests.

The semantic layer work — knowledge graphs, domain ontologies, the grounding layer that gives AI agents real understanding of business domains — is where the most interesting architecture work will happen over the next few years. Technologies that failed to gain mainstream adoption for decades are suddenly relevant again. The value is in composable capabilities that agents can discover and combine, not in opinionated workflows that force customers to serve the software. That’s the core of the argument in SaaS is Dead, Long Live Platforms: the companies that will survive are platforms that expose capabilities, not products that prescribe workflows.

“The infrastructure for the agent era doesn’t exist yet. These are the pieces being assembled.” One concrete finding: practitioners building semantic layers at scale reported that a large telecom’s entire domain ontology could be captured in roughly 286 concepts — “that number made the work feel achievable rather than impossibly ambitious.” — ThoughtWorks Retreat Report, p.11

286 concepts to ground an enterprise-scale AI agent in a real business domain. That’s a tractable engineering problem, not a moonshot. The organizations that treat it that way and start now will be the ones whose agents actually understand what they’re working on.

7. Security Isn’t Ready for This

The most vivid example of where agent security stands: give an agent email access and you’ve enabled password resets and account takeovers. Give it full machine access for development work and you’ve given it full machine access for everything it decides to do.

The industry is treating security as something to solve after the technology works. With agents, this sequencing is dangerous. The ThoughtWorks retreat noted with concern that the security session had low attendance, which they called a reflection of that broader industry pattern. The practitioners who did engage on security were direct: platform engineering should drive secure defaults, making safe behavior easy and unsafe behavior hard. Individual developers cannot be relied on to make security-conscious choices when configuring agent access.

The practical solution isn’t to slow down adoption. It’s to build secure defaults. In From Clicking Yes to Letting Claude Run Wild (Safely), I worked through this problem hands-on: the constant “mother may I” permission dance is tedious, but those guardrails exist for real reasons. VS Code dev containers solved the problem for my local workflow: the agent gets full permissions inside the container, and the worst outcome is blowing away the container. Platform engineering needs to solve this at organizational scale, because individual developers left to themselves will click yes until something bad happens.

Agile isn’t dying; it’s adapting. Teams finding success with AI tools are rediscovering XP practices: pair programming, ensemble development, continuous integration. These create the tight feedback loops and shared understanding that agent-assisted development requires. The real threat isn’t moving too fast. It’s that AI-assisted work makes large changesets easy to produce, and teams drift toward waterfall-like patterns (large, infrequent releases), which runs directly against a decade of DORA research showing that smaller batch sizes correlate with higher stability. Faster tools don’t fix a broken release process; they expose it.

“Platform engineering should drive secure defaults by making safe behavior easy and unsafe behavior hard. Organizations should not rely on individual developers making security-conscious choices when configuring agent access.” Three priorities the retreat identified: security by design as a non-negotiable baseline, cross-industry coalitions for interoperable agent security standards, and AI-enabled defense mechanisms that can match the speed and sophistication of AI-enabled attacks. — ThoughtWorks Retreat Report, p.13

Low attendance at a security session among senior engineering practitioners from major tech companies. That detail alone tells you where the industry is. The people who should be most alarmed about agent security are the same ones not showing up to talk about it. That will change — probably after something bad happens at scale.

8. Swarms, Patrols, and the Work That Doesn’t Get Glamorous

Agent swarming — multiple agents working in parallel toward a common goal — gets most of the attention. But the ThoughtWorks retreat made an observation I find more practically useful: for most enterprise use cases, agent orchestration won’t look like a swarm. It’ll look like patrol workers on loops — agents running well-defined ETL transforms, data quality checks, and business process monitors on continuous cycles. The unglamorous work of data reliability and cleanliness, running always-on in the background. That framing matches exactly what I’ve been seeing in practice.
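
A patrol worker is simple enough to sketch in a few lines of Go. The `Row` type, the two checks, and `PatrolOnce` are hypothetical, and the checks here are plain functions rather than calls to an agent. The shape is what matters: well-defined checks, run on a loop, reporting findings.

```go
package main

import (
	"fmt"
	"time"
)

// Row is a hypothetical record from a pipeline output table.
type Row struct {
	CustomerID string
	Amount     float64
}

// Check is one well-defined patrol task. It returns a finding, or "" when clean.
type Check func(rows []Row) string

// MissingCustomerIDs flags rows without a customer ID.
func MissingCustomerIDs(rows []Row) string {
	n := 0
	for _, r := range rows {
		if r.CustomerID == "" {
			n++
		}
	}
	if n == 0 {
		return ""
	}
	return fmt.Sprintf("%d rows missing customer ID", n)
}

// NegativeAmounts flags rows with an amount below zero.
func NegativeAmounts(rows []Row) string {
	n := 0
	for _, r := range rows {
		if r.Amount < 0 {
			n++
		}
	}
	if n == 0 {
		return ""
	}
	return fmt.Sprintf("%d rows with negative amount", n)
}

// PatrolOnce runs every check once and collects the findings.
func PatrolOnce(rows []Row, checks []Check) []string {
	var findings []string
	for _, check := range checks {
		if f := check(rows); f != "" {
			findings = append(findings, f)
		}
	}
	return findings
}

func main() {
	rows := []Row{{"c1", 10}, {"", 5}, {"c3", -2}}
	checks := []Check{MissingCustomerIDs, NegativeAmounts}

	// A real patrol would loop indefinitely and reload fresh data each
	// cycle. Three short ticks keep this sketch finite.
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		fmt.Println(PatrolOnce(rows, checks))
	}
}
```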

The mental model shift required for parallel agent work is real. Engineers trained in sequential decomposition struggle to conceptualize it — I described the specific patterns that work in Tooling Is the New Model. The breakthrough comes from doing, not theorizing: ask an agent to parallelize work explicitly and observe what happens. The learning is in the observation. Organizations with strong, well-designed APIs are significantly better positioned for both swarming and patrol-style deployment than those without, which is as good a reason as any to get your API house in order now. I covered the practical mechanics of running parallel agent workloads in Get Stuff Done.
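
If the parallel model feels abstract, it may help to see its skeleton in code. The Go sketch below fans independent work packages out to goroutines and collects the results in order. `RunParallel` is a hypothetical helper, and `fakeAgent` is a stand-in for a call to a real coding agent. The precondition is the interesting part: the packages must be independent, which is exactly the decomposition skill that sequential habits don’t train.

```go
package main

import (
	"fmt"
	"sync"
)

// RunParallel hands each independent work package to its own goroutine
// and returns the results in the same order as the inputs.
func RunParallel(packages []string, worker func(string) string) []string {
	results := make([]string, len(packages))
	var wg sync.WaitGroup
	for i, p := range packages {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			results[i] = worker(p) // each goroutine writes only its own slot
		}(i, p)
	}
	wg.Wait()
	return results
}

func main() {
	packages := []string{
		"add pagination to /orders",
		"write migration for invoices",
		"document the auth flow",
	}

	// fakeAgent stands in for a call to a real coding agent.
	fakeAgent := func(task string) string { return "done: " + task }

	for _, r := range RunParallel(packages, fakeAgent) {
		fmt.Println(r)
	}
}
```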

“The first barrier to effective swarming is mental, not technical. Engineers trained in sequential decomposition struggle to conceptualize parallel agent work. This mental model actively blocks learning. Practitioners who have made breakthroughs in swarming describe the experience as fundamentally unlike anything they have encountered in previous software development.” — ThoughtWorks Retreat Report, p.14

The only way to develop the mental model is to do the work. As I wrote in AI Agent Naysayers: “You cannot learn to swim without getting in the water. It’s the same with AI. You need to get wet and try it out.” No amount of reading about parallel agent orchestration prepares you for the first time you actually run it. The skeptics aren’t wrong because they’re stupid; they’re wrong because they haven’t gotten in the water yet. Every single objection dissolves the first time you actually watch a swarm of agents work a problem in parallel. Start small, observe the results, adjust. The practitioners who are ahead on this didn’t read their way there. They built something!

What Comes Next

The open questions aren’t technical. They’re human.

How do you help engineers who love writing code find meaning in supervisory work? The builder compulsion doesn’t go away when the abstraction layer rises — it redirects. The satisfaction of making something remains; only the tools change. In Builders Build, I argued that this instinct is actually the best asset an engineer can have right now: the people who can’t help building things are the ones who will figure out how to build with new tools fastest.

How do you govern organizations where agents move faster than humans can decide? The approval processes and compliance gates built for human-speed development are already becoming the primary bottleneck. Fixing this isn’t a technical problem — it’s organizational redesign, and it requires involving governance and audit functions early rather than treating them as obstacles to navigate around.

How do you build trust in systems that are inherently non-deterministic? Code review as practiced — every line manually inspected — is already breaking under the volume. The replacement is risk tiering: verification investment matched to blast radius. Not every line gets human review. The lines that touch payment processing, authentication, or safety-critical paths get extensive verification. The lines that generate a UI label don’t. This moves engineering from a craft model to a risk management model, and most organizations aren’t there yet.
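
Risk tiering can start very small. This Go sketch assigns a tier to each changed file by path and asks for human review only when a changeset touches a high-risk area. The path fragments in `highRiskMarkers` are assumptions for illustration; every organization would draw its own map of where the blast radius is large.

```go
package main

import (
	"fmt"
	"strings"
)

// Tier is the level of verification a change deserves.
type Tier int

const (
	Low Tier = iota
	High
)

// highRiskMarkers are hypothetical path fragments for code with a large
// blast radius: payments, authentication, safety-critical paths.
var highRiskMarkers = []string{"payments/", "auth/", "safety/"}

// TierFor assigns a risk tier to a changed file based on its path.
func TierFor(path string) Tier {
	for _, m := range highRiskMarkers {
		if strings.Contains(path, m) {
			return High
		}
	}
	return Low
}

// NeedsHumanReview reports whether any file in a changeset is high risk.
func NeedsHumanReview(changed []string) bool {
	for _, p := range changed {
		if TierFor(p) == High {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(NeedsHumanReview([]string{"ui/labels/en.json"}))                                // false
	fmt.Println(NeedsHumanReview([]string{"ui/labels/en.json", "services/payments/charge.go"})) // true
}
```

Path matching is a crude proxy for blast radius, but it is cheap to run in CI, and it makes the review policy explicit instead of leaving it to whoever opens the pull request.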

These aren’t just engineering questions. On May 25, Pope Leo XIV released his first encyclical, Magnifica Humanitas (“Magnificent Humanity”), and the human concerns he raised map directly onto what the ThoughtWorks retreat and I have been wrestling with. He warned that rapid automation could leave workers in “forced inactivity” — stripped of the work that gives life structure and meaning. He cautioned that without stronger safeguards, AI could weaken human agency and shift critical decisions out of human hands. And he was direct that irreversible, high-stakes decisions must remain with humans, not machines. He was not anti-technology — he wrote that “technology should not be considered, in itself, as a force antagonistic to humanity” — but insisted it must be guided toward the common good rather than allowed to concentrate power in few hands at the expense of everyone else.

That’s not a theological argument. That’s a description of the same fault lines the engineering community is navigating. The engineers who lose their identity to automation and drift into “forced inactivity.” The governance structures that need to keep consequential decisions in human hands. The risk-tiering judgment about which systems require human oversight and which don’t. Engineers don’t usually take their cues from papal encyclicals, but when the same concerns surface independently in a ThoughtWorks retreat, in the labor market data, and in a document written for a global audience of billions — that convergence means something.

The field is being rebuilt. The practitioners who understand where it’s going have a significant advantage over those waiting for certainty before they move. Certainty won’t arrive before the window closes.