The New Developer Workflow: Shipping with Codex, Issues, and LLM-First Execution
2026-03-18
The developer workflow has changed fast.
A year ago, most of us were still thinking in the old loop:
- open ticket
- read code
- write code manually
- test
- open PR
That still exists.
But now there is a new layer in the middle:
the developer is increasingly managing an LLM-powered execution loop.
That means tools like Codex are not replacing developers. They are changing what good developers actually spend their time doing.
The highest leverage work is no longer just typing faster.
It is:
- defining the problem clearly
- giving the right context
- constraining the scope
- asking for the right change
- reviewing the output properly
- turning successful fixes into a repeatable workflow
In other words: the job is becoming more architectural, more editorial, and more operational.
The new workflow starts before any code is written
The biggest mistake people make with coding LLMs is treating them like autocomplete with opinions.
That is not where the real gains come from.
The new workflow starts with issue design.
A strong developer now writes an issue in a way that an LLM, an engineer, or a future teammate could all act on with minimal ambiguity.
A weak issue says:
Fix the broken checkout bug.
A strong issue says:
On mobile, the checkout submit button remains disabled after address autofill. Reproduce on Safari iOS width 390px. Expected behaviour: button enables once all required fields are populated. Suspected area: checkout form validation state and autofill event handling. Add a regression test.
That difference matters because LLM tools perform much better when the task has:
- a clear outcome
- a known reproduction path
- a suspected area of the codebase
- constraints
- a definition of done
This is why issue writing is becoming a first-class engineering skill again.
Not because project management got fashionable, but because better task definition now directly improves implementation quality.
What developers do now, start to finish
Here is the practical end-to-end workflow that is becoming normal for AI-assisted engineering.
1. Reproduce and frame the problem
Before handing anything to a tool, the developer works out:
- what is actually broken
- how to reproduce it
- whether it is a bug, missing feature, or unclear requirement
- what "fixed" really means
This part is still human-heavy.
If you give an LLM a vague complaint, you get a vague patch.
If you give it a properly framed engineering problem, you often get a useful first pass much faster than doing everything manually.
2. Write the issue for machine execution, not just human reading
This is one of the biggest shifts.
An issue used to be mostly a communication tool between people.
Now it is often also an execution brief for a coding agent.
A good LLM-ready issue usually includes:
- the bug or feature in one sentence
- why it matters
- exact steps to reproduce
- the expected behaviour
- any logs, screenshots, or failing tests
- the files or subsystems likely involved
- constraints like "do not change API shape" or "preserve existing UI"
- verification steps
That issue is no longer admin overhead.
It is prompt engineering attached to the real codebase.
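Pulled together, those elements make a reusable skeleton. Here is a sketch of one such template — adapt the section names to whatever tracker you use:

```markdown
## Title
One sentence: the bug or feature.

## Why it matters
Impact on users or the business.

## Steps to reproduce
Exact steps, environment, viewport, plus any logs, screenshots, or failing tests.

## Expected behaviour
What "fixed" actually means.

## Likely area
Files or subsystems probably involved.

## Constraints
e.g. do not change API shape, preserve existing UI.

## Verification
Tests to run, behaviour to confirm manually.
```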
3. Let the tool explore the repository
This is where tools like Codex change the speed of the workflow.
Instead of you manually opening twenty files first, the agent can:
- inspect the repo structure
- find relevant templates, routes, handlers, or components
- read existing tests
- identify similar patterns already used elsewhere
- propose where the change should live
That does not remove the need for engineering judgement.
It just compresses the discovery phase.
The developer's role becomes:
- set direction
- validate assumptions
- stop the tool drifting into the wrong area
This feels much closer to directing a capable junior-to-mid engineer than using a code generator.
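In practice, that direction-setting often looks like an explicit exploration prompt sent before any edits are requested. A sketch — the wording is illustrative, not a required format:

```text
Before changing anything, explore the repo and report back:
- where is checkout form validation implemented?
- which existing tests cover it?
- are there similar patterns elsewhere I should follow?
- propose the narrowest set of files to change, with reasons.
Do not modify any files yet.
```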
4. Ask for a scoped implementation
The best results usually come from giving the tool a bounded task.
For example:
- fix this bug and add one regression test
- create the blog post file but do not redesign the template
- refactor this parser without changing output shape
- trace where this value becomes null and patch the narrowest layer possible
Scope matters because LLMs are excellent at local reasoning and much weaker when a task quietly sprawls across architecture, product, and business logic all at once.
So the modern developer learns to break work into well-contained units.
That is not a limitation.
It is good engineering anyway.
5. Review the diff like a real code reviewer
This is where people get lazy, and where bugs get shipped.
You cannot outsource judgement.
Even if Codex writes 80% of the patch, the developer still needs to review:
- did it actually solve the reported problem
- did it change anything unrelated
- did it misunderstand the architecture
- did it add brittle logic
- did it duplicate existing helpers
- did it skip tests
- did it introduce hidden regressions
The review stage matters even more with LLM-generated code because the output is often plausible-looking.
Plausible-looking is not the same as correct.
That means code review is becoming more important, not less.
6. Run the tests, then test the behaviour
Another mistake is assuming green tests mean the change is done.
In AI-assisted workflows, you usually want both:
- automated verification
- behaviour verification
The tool might update a unit test and still miss the user-facing bug.
Or it might patch the UI while forgetting an edge case in the backend.
The best workflow is:
- run the relevant tests
- reproduce the original issue again
- confirm the fix in the real interface or endpoint
- look for nearby regressions
The human stays responsible for reality.
7. Tighten the issue if the first attempt is wrong
This is another big workflow change.
When the first patch misses, the answer is often not "LLMs are useless".
Usually the answer is:
- the issue was underspecified
- the context was incomplete
- the reproduction path was weak
- the scope was too broad
- the verification criteria were unclear
So developers increasingly work in a loop:
issue -> attempt -> review -> sharper issue -> better attempt
That is a very different muscle from pure hand-coding.
It rewards clarity of thought.
8. Merge the fix and capture the pattern
Once the task works, the smart move is not just to merge and forget it.
The new workflow also includes harvesting reusable patterns:
- a stronger issue template
- a better prompt format
- a new regression test pattern
- a checklist for future bugs in the same area
- a team convention for how agents should be used
This is how teams compound gains from LLM tooling.
Not by treating every task as magic, but by building repeatable operating rules.
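A captured pattern can be as lightweight as a checklist committed next to the code. A sketch, with illustrative contents and an invented area name:

```markdown
# Agent workflow notes: checkout (example area)
- Reproduce at mobile widths before and after any fix.
- Validation bugs: check autofill event handling first.
- Every fix ships with a regression test.
- Constraints to restate in every issue: do not change API shape.
```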
What a good LLM issue looks like
If I were writing an issue specifically for an LLM-enabled workflow, I would want it to look something like this:
Title:
Improve newsletter signup validation on mobile Safari
Problem:
Users can enter a valid email, but the submit button does not enable after iOS autofill.
Impact:
This blocks conversions on mobile and creates a false validation failure.
Steps to reproduce:
- Open the signup page in Safari on iOS at iPhone width (390px).
- Use email autofill.
- Observe submit button remains disabled.
Expected behaviour:
The button should enable as soon as the email field contains a valid value.
Likely area:
- form validation hook
- input change handling
- autofill event support
Constraints:
- do not redesign the component
- keep current API contracts
- add a regression test if test coverage exists in this area
Verification:
- existing tests pass
- new regression test passes
- manual check confirms autofill enables submit button
That is not overkill anymore.
That is the input format for high-quality AI-assisted engineering.
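For concreteness, here is a minimal sketch of the kind of narrow, testable logic an issue like that tends to produce. Everything here is hypothetical — the function names, the deliberately simple email check — but it shows the shape: a pure, event-agnostic validator that recomputes from current field values, so it behaves the same whether an autofill update arrives via the event you expected or not.

```typescript
// Hypothetical sketch only: names and validation rules are illustrative.
type FormState = Record<string, string>;

function isValidEmail(value: string): boolean {
  // Deliberately simple; mirror your backend's real rules in practice.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value.trim());
}

// Event-agnostic: recomputes from current values instead of tracking
// which DOM event delivered them, so autofill paths behave like typing.
function canSubmit(state: FormState, required: string[]): boolean {
  return required.every((field) => {
    const value = state[field] ?? "";
    return field === "email" ? isValidEmail(value) : value.trim().length > 0;
  });
}
```

The regression test the issue asks for then reduces to asserting `canSubmit` over states that simulate an autofill update, without needing a real browser.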
What tools like Codex are really doing
The hype version says:
"the AI writes the code now"
The real version is more interesting.
Tools like Codex help developers:
- inspect unfamiliar codebases faster
- move from issue to first patch faster
- draft tests and implementation together
- explain diffs and assumptions
- reduce the boring mechanical parts of coding
But the human still provides:
- product judgement
- architecture judgement
- risk judgement
- prioritisation
- acceptance criteria
So the workflow is not "developer versus AI".
It is:
developer as operator, reviewer, and system designer
That is a meaningful shift in the craft.
The developers who benefit most
The people getting the most value out of this new workflow are usually not the ones writing the fanciest prompts.
They are the ones who:
- understand the codebase
- can isolate a bug properly
- know how to constrain scope
- can recognise bad abstractions quickly
- insist on verification
In short: strong engineers get amplified the most.
That is why I do not think these tools remove the need for real engineering.
I think they raise the premium on it.
Final thought
The old workflow centred on writing every line yourself.
The new workflow centres on guiding a system from problem definition to tested outcome.
That includes:
- writing better issues
- giving better context
- steering tools like Codex
- reviewing diffs carefully
- verifying behaviour in reality
- turning wins into repeatable team process
That is the whole start-to-finish shift.
And honestly, it is one of the most interesting changes software development has had in years.
Next step
Working on something similar?
If you are building AI tooling, automation, internal systems, or trying to untangle a delivery problem, I am always open to a thoughtful conversation.