Empathy Engine

What Happens When Velocity Looks Healthy but the Work Does Not Stay Done?

A green dashboard can still hide the Rework Shadow (Route Rebuilder | Episode 4)

Mark S. Carroll
Jun 17, 2026

👋 Welcome to this week’s edition of Empathy Engine. Every Wednesday, I publish a new article for paid subscribers first, then unlock the full piece for everyone late Thursday morning. Each week, I turn product leadership friction into practical tools, sharper language, and more defensible decisions.

Research Binder: the receipts (citations + source notes) are compiled in a PDF at the bottom of this post.


If You’re Skimming

  • A green velocity dashboard usually measures closure, not durability. It can look healthy while last sprint’s work quietly comes back.

  • This is the QBR Mirage: the visible numbers are real, but they get mistaken for the whole landscape.

  • The hidden follow-on work is the Rework Shadow. Closure is a status. Completion is a result. A story can be closed and incomplete at the same time.

  • Run the Five-Story Field Test at the bottom to see how much of your own “done” did not stay done.

  • The point is route repair, not blame. Velocity is a useful local signal. It just needs witnesses.


The QBR Mirage

The meeting looked clean. Talia, the product lead, walked the room through a polished quarterly deck. Velocity was up, stories were closed, the sprint trend was green, and the dashboard had that calming corporate glow that makes everyone exhale at once. Leadership saw the chart and heard the story they wanted to hear: the team is moving, the work is flowing, the plan is safe. Nothing in the room suggested danger, which is usually how the most expensive problems prefer to enter.

In the room with Talia was Dev, the delivery lead, who had read the same green dashboard all quarter and reported it upward as a health signal. Three weeks later, Marcus, the engineering manager, ran a quiet analysis on the last two sprints. A defect had come back from something marked done, followed by a clarification ticket, a cleanup task, a support issue, and a dependency fix that technically belonged to last sprint but showed up in this one. No single item looked dramatic enough to challenge the dashboard. Together, they raised the question the dashboard had never been built to answer: did the work stay done?

Talia, Dev, and Marcus are composites, not a disguised client or one remembered meeting. I use them because the pattern is real even when no single incident should carry the whole argument. The names change, the tools change, the industry changes. The move does not. The numbers look healthy, the room relaxes, and nobody checks whether the work survived.

Here is my confession. I have sat in that QBR. I have felt the room exhale at the green dashboard, nodded along with “keep this momentum,” and then planned the next quarter against numbers I should have questioned. I have stood inside that moment in Azure DevOps, scrolling back through completed work and feeling the floor drop when the follow-on tickets started tracing back to stories the dashboard had already counted. I was not careless. I was doing the thing competent people do under a clean chart: I trusted closure as if it were completion.

The dashboard did not lie. A lie would be too simple and too comforting, because then we could blame the report and move on. The harder truth is that the dashboard told only the part of the truth it was designed to see. It saw closure, counted points, and surfaced activity. What it did not show was whether that closure survived contact with users, dependencies, support, production, and the next sprint.

The Route That Buried Bad News (Part A) · Mark S. Carroll · May 20

Part B: The Rebuilt Route · Mark S. Carroll · May 27

This is the same failure this series keeps following, one layer deeper. In Episode 1, a warning died in a busy channel because receiving is not routing. In Episode 2, a handoff became a gap because a closed ticket is not a completed handoff. In Episode 3, a gate lived on the diagram but stopped operating in the work. Episode 4 follows the pattern into the dashboard, where closure quietly stands in for completion and the metric reports progress the system has not actually kept.

The Demo Found the Gap · Mark S. Carroll · Jun 3

The Route That Trained Override Behavior · Mark S. Carroll · Jun 10

That hidden follow-on work is the Rework Shadow. The Rework Shadow is not always scandalous or even obvious in the moment. Sometimes it appears as defects, clarifications, cleanup, support tickets, dependency fixes, reopened items, and small pieces of “just one more thing” that return after the original work was counted. No single follow-on ticket proves the system is broken. A recurring pattern is harder to dismiss. When enough “done” work keeps coming back, the team may be borrowing capacity from the future while reporting progress in the present.


What Does a Green Velocity Dashboard Fail to Show?

A green dashboard shows movement, and movement matters. Teams need to know whether work is flowing, leaders need signals, and product planning cannot run on vibes or the sacred spreadsheet someone updates at midnight. The problem is not that dashboards exist. The problem is that a dashboard can show visible progress while hiding invisible drag.

A velocity dashboard can tell you stories closed, points counted, sprint trend improved, and status green. Those signals are not worthless. They become dangerous only when we treat them as complete. What the dashboard may not tell you is whether the same work came back wearing a different ticket number, landing in a different queue, or quietly consuming a different team’s capacity.

This is the QBR Mirage. From above, the surface looks calm. Velocity is up, stories are closed, and the status is on track. Underneath, work may still be accumulating as defects, cleanup, clarifications, support, and reopened items. The mirage is not that the visible numbers are fake. The mirage is that the visible numbers get mistaken for the whole landscape.

It helps to notice who builds this route, because no one builds it on purpose. The same practical lens from earlier episodes applies here, and it is a lens, not a validated taxonomy. The Designers are the product leads and managers, like Talia, who choose the metrics and build the dashboard, because they need something visible to show progress. The Inspectors are the delivery leads and PMO functions, like Dev, who read that dashboard as a health signal and report it upward. The Enforcers are the finance and leadership stakeholders who turn last quarter’s velocity into next quarter’s commitment.

Each role is doing its job. None is positioned to see the full cost when rework lands in support, operations, or the next sprint, which is the cost Marcus eventually found by hand. Changing any one actor does not fix the script. The route needs a better companion signal.


Why Is Velocity Useful, but Incomplete?

Velocity deserves a more honest defense than its loudest critics usually give it. Used locally, velocity can help a team forecast, support sprint planning, and notice whether its own delivery rhythm is changing. It can prompt useful conversations about capacity, scope, and sequencing. That is the useful part, and serious teams should not throw it away because someone online discovered a hot take and dressed it up as a principle.

The incomplete part begins when velocity leaves the team context and starts pretending to be a productivity score. Story points are relative, and velocity is team-specific. A team’s velocity is shaped by estimation habits, story slicing, domain complexity, dependencies, tooling, interrupt load, and Definition of Done. That makes velocity useful for local planning and risky for cross-team comparison. Using velocity as a performance target, an individual score, or a basis for cross-team comparison conflicts with how story points are defined: as relative, team-specific estimates. It also invites gaming the number instead of improving the work.

That last risk has a name. Goodhart’s Law does not need a villain. It just needs a useful measure that becomes the target. When a team is rewarded for the velocity number itself, the rational moves are quiet ones: split stories into smaller points, defer the edge-case testing, lean on optimistic definitions of done. The chart climbs while the durable work slips. I am borrowing Goodhart’s Law as a lens here, not offering it as proof that velocity is doomed. It simply describes what happens to any honest signal once we point planning pressure at it.

So the key sentence is simple. Velocity helps a team plan, but velocity cannot, by itself, prove the work stayed done. That is the nuance too many metric debates lose. The argument is not that velocity is bad. The argument is that velocity is incomplete. Velocity needs witnesses. Quality, stability, flow, rework, and the people living inside the system all need to testify, because one metric should not be forced to impersonate the entire truth.


What Is the Difference Between Closure and Completion?

This is the distinction that makes the rest of the article work. Closure is administrative. Completion is operational. Closure means the ticket moved, the card crossed the board, the field changed to done, the points counted, and the report updated. Completion means the work survived real use. The feature held up, the fix did not reopen the problem, the dependency did not create a second wave, support did not inherit confusion, and the next sprint did not quietly pay the bill.

Closure is a status. Completion is a result. A story can be administratively done and operationally unresolved. This sounds simple until a team starts planning against closure while paying for incomplete work somewhere else. When that happens, the organization can sincerely believe it is moving faster while the team experiences the work as slower, heavier, and more tangled than the dashboard implies.

A story can be closed and incomplete at the same time. The moment it moves to done, the system records closure. The open question is whether that closure survives contact with production, customers, downstream teams, support, and the next planning cycle.

That does not mean every follow-on ticket is a failure. Some follow-on work is healthy, and iteration is how good product development actually happens. We should not punish learning or treat every change as evidence that someone messed up. The point is pattern recognition. If the same kinds of follow-on work keep appearing after the same kinds of closed stories, the route is telling us something. Maybe the Definition of Done is too weak, dependencies are discovered too late, stories are sliced around effort instead of user value, QA is under-powered, or product and engineering are closing different versions of “done.”

My hard-won opinion is that the closure-completion gap usually starts with bad slicing, not bad effort. Teams cut work into pieces that are easy to assign, estimate, and close, then ask the Definition of Done to prove value after the story has already been shaped around task completion. By then, the route is already bent.
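
For readers who want to see the closure versus completion distinction as data, here is a minimal sketch. It assumes you can export closed stories and the follow-on tickets that trace back to them. The `Story` and `FollowOn` records, their field names, the sample keys, and the 30-day window are all hypothetical choices for illustration, not fields from Azure DevOps or any other tracker.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Story:
    key: str
    closed_on: date  # closure: the day the status changed to done


@dataclass(frozen=True)
class FollowOn:
    key: str
    traces_to: str   # key of the closed story this work came back from
    kind: str        # e.g. "defect", "clarification", "cleanup", "support"
    opened_on: date


def stayed_done(story, follow_ons, window_days=30):
    """Completion check: no follow-on traced back to the story in the window."""
    cutoff = story.closed_on + timedelta(days=window_days)
    return not any(
        f.traces_to == story.key and story.closed_on <= f.opened_on <= cutoff
        for f in follow_ons
    )


def rework_shadow(stories, follow_ons, window_days=30):
    """Share of closed stories that picked up follow-on work in the window."""
    if not stories:
        return 0.0
    came_back = [s for s in stories if not stayed_done(s, follow_ons, window_days)]
    return len(came_back) / len(stories)


# Hypothetical sample: four closed stories, four follow-on tickets.
stories = [
    Story("S-1", date(2026, 5, 1)),
    Story("S-2", date(2026, 5, 4)),
    Story("S-3", date(2026, 5, 6)),
    Story("S-4", date(2026, 5, 8)),
]
follow_ons = [
    FollowOn("F-1", "S-1", "defect", date(2026, 5, 12)),
    FollowOn("F-2", "S-1", "cleanup", date(2026, 5, 20)),
    FollowOn("F-3", "S-3", "clarification", date(2026, 5, 15)),
    FollowOn("F-4", "S-4", "support", date(2026, 7, 20)),  # outside the window
]

print(f"closed: {len(stories)}")                                    # closed: 4
print(f"rework shadow: {rework_shadow(stories, follow_ons):.0%}")   # rework shadow: 50%
```

In this sample the dashboard would count four closed stories, while two of them drew follow-on work within 30 days. The number is a prompt for pattern recognition, not a verdict: a single follow-on ticket may be healthy iteration, and the window length is a judgment call each team would need to set for itself.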


How Much Capacity Does Rework Quietly Consume?

Rework is politically dangerous because it hides inside normal work. No one schedules a calendar event called “paying for work we already celebrated.” No dashboard adds a cheerful slide titled “capacity we thought we had but already spent.” Instead, rework disguises itself as routine. A defect here, a clarification there, a cleanup ticket too small to argue about, and a dependency fix everyone agrees is “just part of the work” can look manageable in isolation and still consume the capacity leadership counted as available.

Three local cost categories tend to surface when you trace a Rework Shadow backward. Bystander Burn is the coordination time several people spend seeing, monitoring, and half-investigating work that came back. Investigation Tax is the duplicated effort spent reconstructing why something marked done reopened. Compounding Rework is the cleanup that accumulates when today’s fixes are built on a foundation that was never confirmed, so they quietly generate tomorrow’s follow-on work. None of these show up as one giant visible failure. The team experiences them as friction, context switching, and a sprint plan that somehow always runs optimistic.
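
The capacity math behind that optimistic sprint plan can be sketched in a few lines. This is a back-of-the-envelope illustration, not a measurement method: it assumes follow-on work is estimated in the same story points as planned work, and the 40-point sprint and the individual follow-on estimates are made-up numbers.

```python
def durable_capacity(planned_points, rework_points):
    """Return (points left for new work, share of the plan spent on rework)."""
    if planned_points <= 0:
        raise ValueError("planned_points must be positive")
    available = planned_points - rework_points
    share = rework_points / planned_points
    return available, share


# Hypothetical sprint: leadership counted 40 points as available.
planned = 40

# Hypothetical follow-on items that trace back to work already counted as done.
follow_on_points = {
    "defect": 3,
    "clarification": 1,
    "cleanup": 2,
    "dependency fix": 4,
}

rework = sum(follow_on_points.values())
available, share = durable_capacity(planned, rework)

print(f"rework: {rework} points")             # rework: 10 points
print(f"left for new work: {available}")      # left for new work: 30
print(f"share of plan: {share:.0%}")          # share of plan: 25%
```

No single item in that list is large enough to argue about, which is the point. Added together, they take a quarter of the capacity the plan assumed was free.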
