AI Coding: Implementation Is No Longer the Bottleneck

TL;DR

AI has dramatically reduced the time and cost of software implementation. For business leaders, the new bottleneck is deciding what should be built, setting strong product direction, and validating that the work creates business value. The right engineering leadership makes that possible by giving AI clear constraints, focusing review where it matters most, and keeping product outcomes at the center of the process.

~~~

LLM and agent-based capabilities have been progressing rapidly, and our approach to software development has continued to evolve accordingly. Our IDE (Integrated Development Environment) of choice, Cursor, has done an impressive job of enabling its agents to understand a codebase and plan their implementation with accuracy and sophistication. What it has accomplished looks a lot like an automated version of our team’s previous attempts to manually bring detailed designs and context to our coding prompts. What was tedious and difficult to parcel out appropriately in our prior efforts is now baked into Cursor, and it’s pretty amazing.

With these changes, our coding process has been sped up dramatically.  However, the biggest change is that implementation is no longer the primary bottleneck in product delivery.  Well-understood and specified features can be built extremely quickly.  As a result, engineering value has shifted towards architecture, validation, product understanding, and the operational realities of running reliable production systems.

So what does our work process look like at Flower Press Interactive?

For starters, we are now deeply utilizing AI agents in our development processes. However, we are not treating AI as an autonomous replacement for the whole engineering discipline.  We have thoughtfully incorporated these powerful AI agents into our workflows, while keeping human engineers involved in critical places to ensure the accuracy and reliability of our output.

First, we start out with a solid product specification for the work ahead.  Using “Plan Mode” in Cursor, we turn our specs into an execution plan that can be reviewed. Our AI development environment has graduated from being a pair programming partner to something closer to a reasonably convincing simulacrum of a mid-level engineer. Given clear feature goals, architectural constraints, and a well-defined plan, it can independently execute substantial amounts of implementation work. However, it is still nowhere near as autonomous or reliable as a strong engineer, especially when requirements become ambiguous or implicit.

Once we have a solid plan, we let the AI agents autonomously execute.   After implementation, we review the AI summary of what was done, do a highly targeted review of the code, and then go directly to functional validation.  We are also using a second engineer for PR review, again with a bias towards highly targeted code review and functional validation.

You’ll notice I mention “highly targeted” code reviews. At this point, we focus our review effort on the highest-leverage areas: making sure the agents stay aligned with the project architecture, validating algorithmically complex logic, and carefully reviewing third-party integrations.  We’ve found AI to be exceptionally good at handling large amounts of routine implementation work.  Trusting the agents in these spaces has allowed our engineers to spend more of their time applying judgment where it matters most. The result has been significantly faster development cycles without sacrificing reliability or maintainability.
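As a rough illustration of what “highly targeted” can mean in practice, here is a minimal TypeScript sketch that sorts the files in a change set into “needs close human review” and “routine.” Everything in it is an assumption made for illustration: the directory names, the rule list, and the function names are hypothetical, and a real project would derive its own rules from its own architecture.

```typescript
// Hypothetical review rules. The path prefixes below are illustrative
// assumptions, not a description of any real project layout.
interface ReviewRule {
  reason: string;
  matches: (path: string) => boolean;
}

const reviewRules: ReviewRule[] = [
  // Changes that could pull the code away from the architectural baseline.
  { reason: "architecture", matches: (p) => p.startsWith("src/core/") },
  // Algorithmically complex logic that deserves line-by-line reading.
  { reason: "complex logic", matches: (p) => p.startsWith("src/algorithms/") },
  // Third-party integrations, where mistakes tend to be costly.
  { reason: "third-party integration", matches: (p) => p.startsWith("src/integrations/") },
];

// Returns the reason a file needs close review, or "routine" if no rule matches.
function classify(path: string): string {
  const rule = reviewRules.find((r) => r.matches(path));
  return rule ? rule.reason : "routine";
}

// Keeps only the files a human reviewer should read closely.
function targetedReviewList(changedPaths: string[]): string[] {
  return changedPaths.filter((p) => classify(p) !== "routine");
}

const flagged = targetedReviewList([
  "src/core/router.ts",
  "src/pages/about.tsx",
  "src/integrations/payments.ts",
]);
console.log(flagged);
```

Files that fall through to “routine” are not ignored in this sketch; they are the ones that would be checked by functional validation rather than by close reading of the code.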

AI has also changed how we access and maintain project-specific implementation knowledge.  In the past, we had at least one or two engineers who could remember everything about our active codebases.  Now we are mostly relying on the AI agents to maintain that kind of access to details.  This is turning out to be a significant improvement over the older approach; now, everyone can access any of these details without having to consult the author of the original iteration.

We’re also seeing the value of deep framework-specific expertise compress significantly. Experienced engineers can become productive in unfamiliar stacks extraordinarily quickly because the AI handles much of the implementation-level translation between intent and framework mechanics. The leverage is increasingly coming from product understanding, architectural judgment, debugging ability, and systems thinking rather than memorization of framework-specific details.

So, what makes this shift actually workable in practice?

  1. Strong architecture.  We have a VERY opinionated architectural baseline that defines the project layout and frameworks within which the agents execute.  We have found that agents tend to produce code that fits within the architecture that’s already in place.
  2. AI-assisted debugging. When we uncover issues, we can use the IDE to help us find the source. This has turned out to be surprisingly effective.
  3. Product-oriented validation. Our engineers are forced to develop a deeper product understanding and to validate their code against intended functionality rather than “code correctness.” This addresses a source of errors that is far more common than subtle issues in the code: a lack of human understanding of what the product should do.

Unfortunately, this approach doesn’t apply to all types of problems that can arise from AI agent-generated code. Every feature has a class of requirements that are usually implicit: security, scalability, and performance. Implicit requirements are especially difficult when coding with AI agents. Obviously, the agents aren’t going to be good at doing what they weren’t asked to do, but it’s also unrealistic to expect engineers to explicitly state all those requirements, often because the actual requirement is for the code to NOT display some class of behavior.

For security and scalability, we’ve found that the problem is solved for AI agents the same way it was solved for most software engineers before the move to AI agent coding: have an effective, well-defined architecture that uses popular, modern frameworks, and keep your libraries up to date. Use a well-known authentication library and a modern ORM. For most of our applications, modern frameworks and well-understood architectural patterns eliminate the majority of common security and scalability failure modes. At some point, if you’re lucky, you’re still going to run into scalability issues as your user base grows; at that point, you’ll need to update your architecture. This isn’t something to fear, as the refactor will be exceedingly fast for a reasonably competent senior engineer.

For performance, we are relying on experience to identify likely hotspots preemptively, or we simply deal with performance issues as they crop up. AI has accelerated the diagnosis and remediation of common operational performance issues. Recently, we encountered an issue where new entries were being written to the session table for every page view from an unverified user. The session table was growing uncontrollably, and it lacked an index for the main access pattern during login. AI made it possible to quickly identify the underlying cause of the performance issue, and it also identified the required Express parameter change immediately.
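To make that failure mode concrete, here is a small, self-contained TypeScript simulation. It is a sketch under stated assumptions, not the actual fix: the post does not name the parameter that was changed. One plausible candidate, assuming the app used the express-session middleware, is its `saveUninitialized` option, which controls whether a new, unmodified session is written to the store. The `SessionTable` class, `handleRequest`, and `simulate` below are hypothetical stand-ins that only mimic that behavior in memory.

```typescript
// Hypothetical in-memory stand-in for a database-backed session table.
type SessionData = Record<string, unknown>;

class SessionTable {
  private rows = new Map<string, SessionData>();

  write(id: string, data: SessionData): void {
    this.rows.set(id, data);
  }

  get size(): number {
    return this.rows.size;
  }
}

interface SessionOptions {
  // Mimics the idea behind express-session's `saveUninitialized` flag:
  // when false, a new session that was never modified is not persisted.
  saveUninitialized: boolean;
}

let nextSessionId = 0;

// Simulates one request that arrives without a session cookie.
// Passing a userId models a login, which stores data in the session.
function handleRequest(table: SessionTable, opts: SessionOptions, userId?: string): void {
  const id = `sess-${nextSessionId++}`;
  const data: SessionData = {};
  if (userId !== undefined) {
    data["userId"] = userId;
  }
  const modified = Object.keys(data).length > 0;
  if (modified || opts.saveUninitialized) {
    table.write(id, data);
  }
}

// Returns how many rows the session table holds after the given traffic.
function simulate(opts: SessionOptions, anonymousViews: number, logins: number): number {
  const table = new SessionTable();
  for (let i = 0; i < anonymousViews; i++) handleRequest(table, opts);
  for (let i = 0; i < logins; i++) handleRequest(table, opts, `user-${i}`);
  return table.size;
}

const rowsBefore = simulate({ saveUninitialized: true }, 1000, 5);
const rowsAfter = simulate({ saveUninitialized: false }, 1000, 5);
console.log({ rowsBefore, rowsAfter }); // { rowsBefore: 1005, rowsAfter: 5 }
```

In this toy model, the number of rows tracks page views in the first case and actual logins in the second, which is the difference between a table that grows uncontrollably and one that stays proportional to real users. The missing index would be a separate fix on the database side.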

We have not gone all the way to fully autonomous development teams with minimal human oversight. At this point, we’re not convinced that additional automation would create much additional value for us.

Development is no longer the bottleneck in our process. We can implement features much faster than we can agree on what should actually be built. The scarce resource is no longer coding capacity. Well-designed, well-thought-out products, clear architectural direction, and well-specified functionality are now the limiting resources.

AI has dramatically accelerated implementation, but it has also increased the importance of engineering judgment.
