The Rise of Agentic Programming — and Where It Starts to Struggle

Over the past few months, something shifted in software development.

Not gradually. Not in the “we’ll adapt over time” sense. This happened fast, compressed into a single quarter.

At OrangeLoops, we’ve been investing in GenAI for the past two years. This wasn’t new for us, and still, the changes over the last few months felt sudden.

Vibe-Coding: From “Interesting” to Operational

For a while, vibe-coding sat in an awkward place: interesting but risky, and dismissed by professionals for “real system development”.

That changed recently.

December brought a wave of better models, with larger context windows and improved performance on multi-step tasks, that pushed these systems into a different category. Tasks that required careful decomposition a few months ago can now be one-shotted, and it works often enough that you start to trust them.

What was fringe at the beginning of 2025 has become mainstream. Coding agents such as Claude Code and CODEX went from experimental curiosities to tools software developers will need to have in 2026.

Not for everything. Not always. But enough to matter.

Doing our own research

In February, we decided to test this in a way that would give us a clear signal.

We took DrillRoom, one of our products, and tried to port the iOS app to Android using an agent-driven workflow.

– ~90k lines of Swift
– ~670 files
– A real system, doing real-time event detection and running a pipeline of ML models on the edge, not a demo

Normally, this is a multi-month effort for a small team.

We approached it differently:
– One developer
– CLI-based workflow
– Agent orchestration (mainly Claude Code, plus some CODEX)
– No Android IDE
– No manual Kotlin work

Two weeks in:

– ~40k lines of Kotlin
– Most of the UI ported
– A large portion of the logic in place
– Running on devices, entering testing

So yes, this works. At least for this class of porting problem, where the legacy codebase serves as the spec for what needs to be built.

But that’s not the whole story.

Where Things Start to Break

The gains are real, but they’re not linear. Agentic programming compresses the cost of building systems, but it does not compress the cost of understanding or evolving them. And that gap becomes visible as systems grow.

What we’re seeing is that productivity improves dramatically in greenfield scenarios, in the early stages of a system, and then starts to degrade as complexity increases.

In our experience so far, once you move into more complex systems, somewhere around the 100k LOC range, the dynamics change:

– Iterations become less predictable
– Small changes break unrelated parts
– Progress slows down

Not because you can’t generate more code, but because you stop having a clean mental model of the system, and the agents don’t have it either.

This shows up when new systems reach a certain size and when trying to evolve existing ones, but also when tackling complex features, or when troubleshooting spans a mix of software and hardware. The AI tends to fix issues by changing the software, even when the problem is on the hardware side. There are still clear gaps in judgment in certain scenarios.

It’s likely that we don’t yet have the right tools and methodologies to tackle larger systems, but what feels magical at first starts to drag, and frustration builds as the system grows more complex.

What Didn’t Change

If anything, this shift makes some old truths more visible.

There are still no silver bullets.

The hardest parts of software were never just about writing code. They’re about:

– Getting the specification right
– Ensuring the system behaves as intended
– Keeping it coherent as it evolves

Now that generating code is cheap, mistakes in those areas propagate faster.

Which makes things like:

– clear specs
– solid test harnesses
– strong validation loops

go from “good practice” to baseline requirements.
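To make “strong validation loops” concrete: in practice, we mean wrapping the agent in a loop that only accepts a change once the test harness passes, and feeds failures back as context. Here is a minimal sketch; `generate_patch` and `run_tests` are hypothetical stand-ins for your agent CLI and test runner, not real APIs.

```python
def validation_loop(generate_patch, run_tests, max_iters=5):
    """Drive an agent until the test suite passes or we give up.

    generate_patch(report): applies a candidate change, given the latest
        failure report (None on the first attempt).
    run_tests(): returns (passed: bool, report: str), e.g. by shelling
        out to your project's test command.
    Both callables are assumptions standing in for real tooling.
    """
    report = None
    for attempt in range(1, max_iters + 1):
        generate_patch(report)        # e.g. invoke the coding agent with the failures
        passed, report = run_tests()  # e.g. run the project's test suite
        if passed:
            return True, attempt      # converged: tests are green
    return False, max_iters           # didn't converge: escalate to a human

# Usage with stubs: an "agent" that gets it right on the third try.
calls = {"n": 0}
def fake_patch(report):
    calls["n"] += 1
def fake_tests():
    return (calls["n"] >= 3, "2 tests failing")

ok, iters = validation_loop(fake_patch, fake_tests)
print(ok, iters)  # True 3
```

The key design point is that the loop, not the agent, decides when work is done, and that it has a bounded number of attempts before a human steps in.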

There’s also a growing gap between what the system does and how well we understand it, often described as cognitive debt.

Agents don’t remove that. If anything, they can increase it.

Adapting While Moving

From a delivery perspective, this creates tension.

Some of our existing tools and habits don’t fit as well:

– Estimating effort the way we used to
– Structuring work in fixed cycles
– Thinking in strictly defined and rigid roles

At the same time, new patterns are emerging, though not fully formed:

– Early movement toward spec-driven workflows
– More investment in evaluation and validation conditions
– More emphasis on defining intent clearly
– Disposable prototyping gaining strength

At OrangeLoops, we’re actively working through this, experimenting, adjusting, and integrating these approaches into real delivery. We’re testing spec-driven development frameworks and coding-agent platforms, and even building some tooling of our own, which I hope to share soon.

Not everything works. Some things do. The constraints didn’t disappear. They are shifting.

It’s a shift in the nature of the software development workflow. What used to be a craft of assembling building blocks is getting closer to the alchemical nature of training machine learning models. We are throwing dice here, but that’s for a different blog post.

It was the best of times, it was the worst of times…

So we find ourselves in a period where the old rules no longer always hold, while the full scope of what will potentially be is not yet clear either.

In this context, invest in documenting your systems and in writing specifications for what you want to build. Word is that specifications are the new source code.

Be skeptical of anyone who tells you that you don’t need humans anymore for building software (especially if they are trying to get you to spend on tokens). For reasonably complex systems, that’s still not the case. Be equally skeptical of anyone who tells you nothing has really changed.

Be ready to change your workflow every other week, as the community shares new tools, methodologies, and best practices every day.

Don’t assume the AI models will stagnate or plateau. Don’t bet against AI. Try to architect things so that systems and processes will only improve with better models.

At least, that’s what we are doing.
