Empathy Engine

The Sprint Stayed Green, The Budget Didn't

Profit With Proof | Episode 1

Mark S. Carroll
Mar 17, 2026


How ambiguity at the start of a sprint becomes a margin problem at the end of a quarter

The sprint closed. The feature shipped. The budget did not agree.


Research Binder: the receipts (citations + source notes) are compiled in a PDF at the bottom of this article.


Here is how to price what your metrics are hiding.

A product manager writes the story in forty-five minutes because the planning meeting is already running hot and nobody wants to be the person who slows the room down. The acceptance criteria look finished enough to survive the scroll test. Engineering reads them one way. QA reads them another. Nobody is confused enough to stop the sprint.

That is the dangerous part. Everybody sees just enough clarity to keep moving.

By Friday, the ticket is closed. By Monday, a follow-on ticket is open. Nobody logs it as rework. Nobody calls it duplicate labor. Nobody says out loud that the same feature is now billing the organization twice. The sprint velocity stays green. The budget does not.

Most teams treat this as a collaboration annoyance. A miscommunication. A little friction between functions. That reading is too small. What just happened was not a failure to align. It was hidden cost entering the system through a door your metrics do not watch.

That is the rework tax.


We love calling it “done” on Friday and “unexpected complexity” on Monday, as if the invoice cannot read

The measurement trap

The first trap is not incompetence. It is measurement. Story points measure closure. Velocity measures throughput. Neither tells you whether the work was right the first time. That sounds obvious when you say it slowly. It becomes much less obvious once a dashboard goes green and the meeting ends.

The system has already decided what counts as progress, and what does not.

A green dashboard is operationally comforting because it gives everyone something clean to point at. Tickets moved. Commitments held. Sprint intact. The board can breathe. Leadership can move on. The team can say it shipped. Nobody has to ask the more expensive question, which is whether the shipped work matched the intended work closely enough to avoid paying for it again.

That omitted question is where the tax lives.

Nothing says operational excellence like a glowing dashboard sitting directly above a screaming underground crime scene

What the dashboard reports is closure. What the business still pays for is everything closure failed to prevent: hidden spend, follow-on tickets, reopened work, quiet capacity loss, engineers fixing the same shape twice under a new label. The dashboard is not lying. It is omitting the bill.

And that omission matters, because the reopen is not where the cost starts. The reopen is where the cost finally becomes visible.


Same feature. Second bill.

One story with two valid readings creates two operational realities. One team builds what it believes was requested. Another verifies against what it expected to receive. Both teams can be competent. Both can act in good faith. That does not lower the invoice. It increases it.

The first build still consumed planning time, engineering time, review time, and QA time. Once the story reopens, the organization is not buying new value. It is paying again for value it thought it already purchased.

Same feature. Second bill.

A masterpiece in modern budgeting. Buy one feature, fund it twice, and call the second payment alignment

That is why rework feels so corrosive. It is not only expensive. It is insulting. Teams are not just asked to do more work. They are asked to repeat work the system already counted as done. The emotional damage is real. But the financial damage is the part leaders keep missing. Second-pass labor is still labor. Delayed value is still delayed value. A closed story that comes back wearing a different name is still the same money leak.


Where the pile-up actually comes from

If rework were evenly distributed across all causes, the answer would be a broad call for better discipline everywhere. That is not what the evidence suggests. The pattern is more concentrated than that.

Research and industry benchmark data consistently show that requirements-related defects own an outsized share of total rework cost. A peer-reviewed SEI technical report on DoD acquisition systems found that requirements-related rework accounted for 79% of total rework cost, with design defects accounting for 16% and other sources for the remainder. Separately, industry-expert benchmark data from Capers Jones, cited across multiple peer-reviewed SEI and IEEE publications, estimates that typical organizations spend 40 to 50% of total project effort reworking defects in requirements, design, and code. Immature organizations may reach 60 to 80%.
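To make those percentages concrete, here is a back-of-the-envelope sketch. The 79% requirements share and the 40 to 50% rework-effort range come from the figures above; the budget number is hypothetical, and your own shares will differ.

```python
# Back-of-the-envelope rework-tax estimate.
# Benchmark inputs cited above: 40-50% of total project effort is rework
# (Capers Jones), and ~79% of rework cost traces to requirements (SEI).
# The annual budget figure is hypothetical.

def rework_tax(annual_delivery_budget, rework_share, requirements_share=0.79):
    """Estimate total rework spend and the requirements-driven slice of it."""
    total_rework = annual_delivery_budget * rework_share
    requirements_rework = total_rework * requirements_share
    return total_rework, requirements_rework

budget = 5_000_000  # hypothetical annual delivery spend, USD
for share in (0.40, 0.50):
    total, reqs = rework_tax(budget, share)
    print(f"rework share {share:.0%}: total ${total:,.0f}, "
          f"requirements-driven ${reqs:,.0f}")
```

Even at the low end of the range, roughly a third of the whole budget is being spent re-buying requirements clarity the team never purchased up front.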

Turns out “random software chaos” is not that random. Requirements are not just in the lead. They are lapping the field

That is what changes the argument. Requirements discipline is not process theater. It is an economic choke point. Design still matters. Other sources still exist. But when one source bucket is this lopsided, the response cannot stay generic.



How delay multiplies the bill

The problem gets worse because delay changes the economics. A miss caught early is still annoying. A miss caught late is punitive. The same misunderstanding moves through different cost states as it survives longer.

Classic peer-reviewed research across multiple studies, including Boehm (1988) and McConnell (2001), places the cost of fixing a requirements defect in production at 10 to 200 times the cost of catching it in the requirements phase. The IBM Systems Sciences Institute placed it at 15 times. The SEI SAVI analysis independently confirmed a range of one to two orders of magnitude. The specific multiplier varies by organization and defect type, but the direction is consistent across decades of evidence: what felt like “we can fix it later” at planning time becomes a different class of expense by testing, and another one again by production.

The defect did not merely survive. It appreciated.

The defect did not get more interesting with age. It just got more expensive and more eager to ruin somebody’s quarter

“Fix it later” sounds like sequencing. In practice, it is borrowing against future focus, future budget, and future credibility. By the time a requirements miss travels through testing, release, and production support, the cost is no longer just technical. It becomes coordination drag, support burden, schedule disruption, and management attention nobody budgeted to spend.

Delay does not just extend the problem. It multiplies the bill.
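The multiplication is easy to see in numbers. This sketch uses illustrative multipliers drawn from inside the ranges cited above (IBM's 15x, the Boehm/McConnell 10 to 200x band); the base fix cost and the intermediate phase values are hypothetical.

```python
# How one requirements miss re-prices itself by detection phase.
# Multipliers are illustrative points inside the ranges cited above:
# 15x at testing matches the IBM figure; 100x sits inside the
# 10-200x production band from Boehm/McConnell. Intermediate values
# and the base cost are hypothetical.

PHASE_MULTIPLIER = {
    "requirements": 1,    # caught in the planning conversation
    "design": 3,
    "coding": 7,
    "testing": 15,        # IBM Systems Sciences Institute figure
    "production": 100,    # inside the 10-200x band
}

base_fix_cost = 500  # hypothetical cost to fix at requirements time, USD

for phase, mult in PHASE_MULTIPLIER.items():
    print(f"{phase:>12}: ${base_fix_cost * mult:,}")
```

The same misunderstanding, untouched, goes from a line item to a budget event purely by surviving longer.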

At small scale, a reopened story looks like a team problem. At actual scale, it belongs to a pattern that operates across the industry. Industry-expert benchmark data from the Consortium for Information and Software Quality (CISQ 2022 Cost of Poor Software Quality report, US-only, aggregated from 88 public sources) estimates the total US cost of poor software quality at $2.41 trillion, with finding and fixing defects accounting for $479 billion of that figure. Your Monday morning reopen is one data point in that pattern. One hidden ticket does not look like a national problem. Thousands of those tickets, spread across organizations that measure delivery better than correctness, absolutely do.


When the vault is on fire, maybe stop calling rework a “team-level process issue” and start calling it money

Why small upstream gains are worth more than they look

The good news is not that the problem is small. It is not. The good news is that small upstream gains can be worth more than they look. A peer-reviewed SEI study of DoD aerospace systems (CMU/SEI-2018-TR-002) found that each 1% increase in requirements defect removal efficiency saved $22 million for a 27-million-source-line-of-code system, with savings scaling linearly with system size and total rework percentage. That is a specific finding from a specific context, and you should calibrate to your own system size and baseline. But the directional principle it encodes is robust across decades of evidence: catching requirements problems before they travel downstream recovers real money, not promised money.

Not magical money. Not guaranteed ROI. Real avoided rework that never had to be funded in the first place. The appropriate posture for any claim in this territory is transparency and defensibility, not guaranteed outcomes. That is exactly the posture this article takes.
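In that spirit of defensibility, here is what the linear scaling in the SEI finding looks like as arithmetic. The $22M-per-point and 27M-SLOC reference values are from the report cited above; the smaller codebase and the two-point gain are hypothetical, and a real estimate would need your own baseline rework percentage.

```python
# Linear extrapolation of the SEI finding (CMU/SEI-2018-TR-002):
# each +1 percentage point of requirements defect-removal efficiency
# saved ~$22M on a 27-million-SLOC system, scaling linearly with size.

SEI_SAVINGS_PER_POINT = 22_000_000  # USD, at the reference system size
SEI_REFERENCE_SLOC = 27_000_000

def estimated_savings(system_sloc, dre_point_gain):
    """Scale the SEI result linearly; calibrate to your own baseline."""
    per_point = SEI_SAVINGS_PER_POINT * (system_sloc / SEI_REFERENCE_SLOC)
    return per_point * dre_point_gain

# Hypothetical: a 2.7M-SLOC system improving requirements DRE by 2 points.
print(f"${estimated_savings(2_700_000, 2):,.0f}")  # prints $4,400,000
```

Treat the output as a sizing exercise, not a forecast: the point is that small upstream percentage gains translate into budget-scale numbers, not that any particular dollar figure is guaranteed.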

Funny how a tiny improvement suddenly looks huge once the dollars stop being theoretical and start wearing a suit
