The New Developer Workflow: Shipping with Codex, Issues, and LLM-First Execution
2026-03-18
The developer workflow has changed fast.
A year ago, most of us were still thinking in the old loop:
- open ticket
- read code
- write code manually
- test
- open PR
That still exists.
But now there is a new layer in the middle:
the developer is increasingly managing an LLM-powered execution loop.
That means tools like Codex are not replacing developers. They are changing what good developers actually spend their time doing.
The highest leverage work is no longer just typing faster.
It is:
- defining the problem clearly
- giving the right context
- constraining the scope
- asking for the right change
- reviewing the output properly
- turning successful fixes into a repeatable workflow
In other words: the job is becoming more architectural, more editorial, and more operational.
The new workflow starts before any code is written
The biggest mistake people make with coding LLMs is treating them like autocomplete with opinions.
That is not where the real gains come from.
The new workflow starts with issue design.
A strong developer now writes an issue in a way that an LLM, an engineer, or a future teammate could all act on with minimal ambiguity.
A weak issue says:
Fix the broken checkout bug.
A strong issue says:
On mobile, the checkout submit button remains disabled after address autofill. Reproduce on Safari iOS width 390px. Expected behaviour: button enables once all required fields are populated. Suspected area: checkout form validation state and autofill event handling. Add a regression test.
That difference matters because LLM tools perform much better when the task has:
- a clear outcome
- a known reproduction path
- a suspected area of the codebase
- constraints
- a definition of done
This is why issue writing is becoming a first-class engineering skill again.
Not because project management got fashionable, but because better task definition now directly improves implementation quality.
What developers do now, start to finish
Here is the practical end-to-end workflow that is becoming normal for AI-assisted engineering.
1. Reproduce and frame the problem
Before handing anything to a tool, the developer works out:
- what is actually broken
- how to reproduce it
- whether it is a bug, missing feature, or unclear requirement
- what "fixed" really means
This part is still human-heavy.
If you give an LLM a vague complaint, you get a vague patch.
If you give it a properly framed engineering problem, you often get a useful first pass much faster than doing everything manually.
2. Write the issue for machine execution, not just human reading
This is one of the biggest shifts.
An issue used to be mostly a communication tool between people.
Now it is often also an execution brief for a coding agent.
A good LLM-ready issue usually includes:
- the bug or feature in one sentence
- why it matters
- exact steps to reproduce
- the expected behaviour
- any logs, screenshots, or failing tests
- the files or subsystems likely involved
- constraints like "do not change API shape" or "preserve existing UI"
- verification steps
That issue is no longer admin overhead.
It is prompt engineering attached to the real codebase.
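Pulled together, those elements make a reusable skeleton. Here is a sketch of one such template — adapt the section names to whatever tracker you use:

```markdown
## Title
One sentence: the bug or feature.

## Why it matters
Impact on users or the business.

## Steps to reproduce
Exact steps, environment, viewport, plus any logs, screenshots, or failing tests.

## Expected behaviour
What "fixed" actually means.

## Likely area
Files or subsystems probably involved.

## Constraints
e.g. do not change API shape, preserve existing UI.

## Verification
Tests to run, behaviour to confirm manually.
```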
3. Let the tool explore the repository
This is where tools like Codex change the speed of the workflow.
Instead of you manually opening twenty files first, the agent can:
- inspect the repo structure
- find relevant templates, routes, handlers, or components
- read existing tests
- identify similar patterns already used elsewhere
- propose where the change should live
That does not remove the need for engineering judgement.
It just compresses the discovery phase.
The developer's role becomes:
- set direction
- validate assumptions
- stop the tool drifting into the wrong area
This feels much closer to directing a capable junior-to-mid engineer than using a code generator.
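In practice, that direction-setting often looks like an explicit exploration prompt sent before any edits are requested. A sketch — the wording is illustrative, not a required format:

```text
Before changing anything, explore the repo and report back:
- where is checkout form validation implemented?
- which existing tests cover it?
- are there similar patterns elsewhere I should follow?
- propose the narrowest set of files to change, with reasons.
Do not modify any files yet.
```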
4. Ask for a scoped implementation
The best results usually come from giving the tool a bounded task.
For example:
- fix this bug and add one regression test
- create the blog post file but do not redesign the template
- refactor this parser without changing output shape
- trace where this value becomes null and patch the narrowest layer possible
Scope matters because LLMs are excellent at local reasoning and much weaker when a task quietly sprawls across architecture, product, and business logic all at once.
So the modern developer learns to break work into well-contained units.
That is not a limitation.
It is good engineering anyway.
5. Review the diff like a real code reviewer
This is where people get lazy, and where bugs get shipped.
You cannot outsource judgement.
Even if Codex writes 80% of the patch, the developer still needs to review:
- did it actually solve the reported problem
- did it change anything unrelated
- did it misunderstand the architecture
- did it add brittle logic
- did it duplicate existing helpers
- did it skip tests
- did it introduce hidden regressions
The review stage matters even more with LLM-generated code because the output is often plausible-looking.
Plausible-looking is not the same as correct.
That means code review is becoming more important, not less.
6. Run the tests, then test the behaviour
Another mistake is assuming green tests mean the change is done.
In AI-assisted workflows, you usually want both:
- automated verification
- behaviour verification
The tool might update a unit test and still miss the user-facing bug.
Or it might patch the UI while forgetting an edge case in the backend.
The best workflow is:
- run the relevant tests
- reproduce the original issue again
- confirm the fix in the real interface or endpoint
- look for nearby regressions
The human stays responsible for reality.
7. Tighten the issue if the first attempt is wrong
This is another big workflow change.
When the first patch misses, the answer is often not "LLMs are useless".
Usually the answer is:
- the issue was underspecified
- the context was incomplete
- the reproduction path was weak
- the scope was too broad
- the verification criteria were unclear
So developers increasingly work in a loop:
issue -> attempt -> review -> sharper issue -> better attempt
That is a very different muscle from pure hand-coding.
It rewards clarity of thought.
8. Merge the fix and capture the pattern
Once the task works, the smart move is not just to merge and forget it.
The new workflow also includes harvesting reusable patterns:
- a stronger issue template
- a better prompt format
- a new regression test pattern
- a checklist for future bugs in the same area
- a team convention for how agents should be used
This is how teams compound gains from LLM tooling.
Not by treating every task as magic, but by building repeatable operating rules.
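A captured pattern can be as lightweight as a checklist committed next to the code. A sketch, with illustrative contents and an invented area name:

```markdown
# Agent workflow notes: checkout (example area)
- Reproduce at mobile widths before and after any fix.
- Validation bugs: check autofill event handling first.
- Every fix ships with a regression test.
- Constraints to restate in every issue: do not change API shape.
```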
What a good LLM issue looks like
If I were writing an issue specifically for an LLM-enabled workflow, I would want it to look something like this:
Title:
Improve newsletter signup validation on mobile Safari
Problem:
Users can enter a valid email, but the submit button does not enable after iOS autofill.
Impact:
This blocks conversions on mobile and creates a false validation failure.
Steps to reproduce:
- Open the signup page in Safari on iOS at iPhone width (390px).
- Use email autofill.
- Observe submit button remains disabled.
Expected behaviour:
The button should enable as soon as the email field contains a valid value.
Likely area:
- form validation hook
- input change handling
- autofill event support
Constraints:
- do not redesign the component
- keep current API contracts
- add a regression test if test coverage exists in this area
Verification:
- existing tests pass
- new regression test passes
- manual check confirms autofill enables submit button
That is not overkill anymore.
That is the input format for high-quality AI-assisted engineering.
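For concreteness, here is a minimal sketch of the kind of narrow, testable logic an issue like that tends to produce. Everything here is hypothetical — the function names, the deliberately simple email check — but it shows the shape: a pure, event-agnostic validator that recomputes from current field values, so it behaves the same whether an autofill update arrives via the event you expected or not.

```typescript
// Hypothetical sketch only: names and validation rules are illustrative.
type FormState = Record<string, string>;

function isValidEmail(value: string): boolean {
  // Deliberately simple; mirror your backend's real rules in practice.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value.trim());
}

// Event-agnostic: recomputes from current values instead of tracking
// which DOM event delivered them, so autofill paths behave like typing.
function canSubmit(state: FormState, required: string[]): boolean {
  return required.every((field) => {
    const value = state[field] ?? "";
    return field === "email" ? isValidEmail(value) : value.trim().length > 0;
  });
}
```

The regression test the issue asks for then reduces to asserting `canSubmit` over states that simulate an autofill update, without needing a real browser.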
What tools like Codex are really doing
The hype version says:
"the AI writes the code now"
The real version is more interesting.
Tools like Codex help developers:
- inspect unfamiliar codebases faster
- move from issue to first patch faster
- draft tests and implementation together
- explain diffs and assumptions
- reduce the boring mechanical parts of coding
But the human still provides:
- product judgement
- architecture judgement
- risk judgement
- prioritisation
- acceptance criteria
So the workflow is not "developer versus AI".
It is:
developer as operator, reviewer, and system designer
That is a meaningful shift in the craft.
The developers who benefit most
The people getting the most value out of this new workflow are usually not the ones writing the fanciest prompts.
They are the ones who:
- understand the codebase
- can isolate a bug properly
- know how to constrain scope
- can recognise bad abstractions quickly
- insist on verification
In short: strong engineers get amplified the most.
That is why I do not think these tools remove the need for real engineering.
I think they raise the premium on it.
Final thought
The old workflow centred on writing every line yourself.
The new workflow centres on guiding a system from problem definition to tested outcome.
That includes:
- writing better issues
- giving better context
- steering tools like Codex
- reviewing diffs carefully
- verifying behaviour in reality
- turning wins into repeatable team process
That is the whole start-to-finish shift.
And honestly, it is one of the most interesting changes software development has had in years.
Next step
Working on something similar?
If you are building AI tooling, automation, internal systems, or trying to untangle a delivery problem, I am always open to a thoughtful conversation.