Operational Learning: Exception Design
Unclassified exceptions are how standards die: not in a single, dramatic way, but one reasonable decision at a time.
Most organizations know how to handle an exception in the moment. Someone makes a call, the issue gets resolved, the work continues. That part isn’t broken.
What’s broken is what happens after. The exception gets solved and then it disappears. Nobody asks what it meant. Nobody checks whether it happened before. Nobody looks at it as anything other than a problem that’s been handled.
Unclassified exceptions are how standards die. Not loudly. Not in a single dramatic failure. They die the way most operational failures happen: one reasonable decision at a time, with nobody keeping score.
And every time one disappears, the information it carried walks out of the room with whoever solved the problem.
* * *
A regional sales director is on the phone with the VP of procurement at one of her company’s top-ten accounts. The relationship is four years old. The account represents significant annual revenue. And the VP is asking for something specific: a delivery commitment two weeks ahead of standard lead time. His team has a project with a hard install date. Materials need to be on-site early or the timeline collapses.
The ask isn’t unreasonable. She knows this customer. They don’t manufacture emergencies. This is a real constraint, and the relationship has earned the right to make the ask.
She calls operations. Explains the situation. Asks if they can make it work.
They can. It takes some rearranging. The production scheduler shifts a few jobs, pulls forward a materials release, coordinates with logistics on an adjusted pickup window. Nobody stays late. Nobody cuts corners. The team absorbs it, the way good operations teams do when the ask is reasonable and the asker is credible.
The customer is happy. The sales director looks great. Operations moves on to the next problem.
* * *
Six weeks later, a different rep makes a similar commitment to a different account. This time he doesn’t call operations first. He promises the date in the meeting, because word has gotten around. Not in a memo. Not in a policy update. It traveled the way these things always travel: a win story mentioned in a team huddle, a comment over coffee about how operations came through on that big account, a growing sense across the sales floor that accelerated delivery was something the company could do if you asked the right way.
Operations scrambles again. Makes it work again.
By end of quarter, the planning team is building production schedules against published lead times that no longer reflect what sales is actually promising in the field. The gap doesn’t show up in any system. It shows up as pressure. Production feels squeezed but the metrics look normal. Planning’s forecasts are technically correct but operationally useless. Then a materials shortage hits in week ten because capacity had been quietly overcommitted for two months. A second-tier customer misses a delivery window. Not because of the shortage itself, but because the buffer that would have absorbed it had already been spent on accelerated commitments nobody was tracking.
Sales thinks operations is getting slower. Operations thinks sales is making promises they can’t keep. Both are wrong.
The standard just moved, one handshake at a time, and nobody noticed.
The problem is not the exception.
That first call was a good decision. A capable leader read a real situation, weighed the relationship, assessed the risk, and asked for something reasonable. The operations team found a path and delivered without compromising other commitments.
The problem is that the organization had no way to learn from it.
Nobody connected that first accommodation to the identical exception six weeks later. And when the materials shortage hit, nobody could trace it back to the accumulated weight of commitments that had been made outside the plan, because those commitments existed only in the memories of the people who made them.
The result is what you might call a shadow standard: two operating realities running at the same time. There’s the documented system, where lead times are published, planning builds on those numbers, and capacity decisions follow accordingly. And there’s the practiced system, where commitments flex based on relationships, urgency, and informal knowledge of what operations can actually absorb.
The trust erosion that follows is lateral, not hierarchical. Sales and operations both acted reasonably within the information they had. When a missed delivery surfaces and nobody can explain the real cause, the space between those two realities fills with blame. Not because anyone lied. Because the organization never made the exception visible, and without visibility, there’s no shared understanding of what actually happened.
Most organizations stop here. The problem got solved. The quarter closed. There’s always another fire.
That’s exactly where the real cost hides. Because exceptions aren’t just operational noise. They’re the most honest feedback your operating model produces about where it no longer fits the world it’s operating in.
That first call revealed three things at once: the standard lead time may no longer match what key accounts need, a top customer is asking for flexibility the published process doesn’t offer, and operations has more capacity to flex than planning models account for. Each is a data point. Together, they tell a story about where the operating model is drifting from the conditions it was designed for.
That story only gets told if someone captures it. And in most organizations, nobody does. The exception gets resolved. Everyone moves on. The information dies in the space between the phone call and the next task.
The question isn’t whether your organization has exceptions. Of course it does.
The question is whether those exceptions are teaching you anything. Whether you’re treating them as disposable incidents to survive, or as inputs that reveal where your standards are aging, where your assumptions have shifted, and where the gap between what you’ve designed and what the work actually demands is getting wider.
Plenty of organizations have built the discipline to classify exceptions, to route them through escalation paths and log them in tracking systems. Classification matters. This piece is about the step that comes after. The step almost nobody takes: looking at what the exceptions are telling you, as a pattern, about the fitness of the system itself.
So what does that learning discipline actually look like?
It starts with a commitment that sounds simple and turns out to be hard: every exception gets seen. Not every exception needs the same treatment. Not every one triggers a review or a policy change. Every one gets acknowledged, which means someone pauses long enough to ask what it means.
The first questions are straightforward. What type of exception is this, and what drove it? Not the surface reason, but the underlying condition. That sales director’s customer didn’t just want faster delivery. He had a project constraint that the standard lead time couldn’t accommodate. The driver was the project constraint; the exception was the symptom. And the type tells you where the pressure is coming from: customer-driven, internal process workaround, capacity limitation, supplier deviation. Each one points to a different part of the operating model.
This is where most organizations stop, and it’s exactly where the learning begins. What’s the risk if this repeats? One accelerated delivery is absorbed easily. Five in a quarter starts distorting capacity planning. Twenty rewrites the operating model without anyone’s approval. The risk isn’t the single exception. It’s the compounding effect when the same exception recurs and nobody’s watching the accumulation.
And the question that matters most: what does this exception tell us about the fitness of the standard? Is the standard still right and this was a genuine one-off? Is the market signaling that the standard needs to evolve? Or is there a segment of customers that needs a different tier of service entirely? Those are three very different answers, and each one leads to a different operational response. None of them become available until someone asks.
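Teams that log exceptions sometimes formalize these questions as a simple record. A minimal sketch in Python, where the field names and category labels are illustrative assumptions drawn from the questions above, not a standard taxonomy:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class ExceptionType(Enum):
    # Illustrative categories from the discussion above; extend to fit your model.
    CUSTOMER_DRIVEN = "customer-driven"
    PROCESS_WORKAROUND = "internal process workaround"
    CAPACITY_LIMIT = "capacity limitation"
    SUPPLIER_DEVIATION = "supplier deviation"

@dataclass
class ExceptionRecord:
    occurred: date
    exc_type: ExceptionType
    standard: str       # which documented standard was flexed
    driver: str         # the underlying condition, not the surface ask
    repeat_risk: str    # what compounds if this recurs unwatched
    fitness_note: str   # one-off, evolve the standard, or new service tier?

# The five-minute pause from the story, captured as data (hypothetical values):
record = ExceptionRecord(
    occurred=date(2024, 3, 4),
    exc_type=ExceptionType.CUSTOMER_DRIVEN,
    standard="standard delivery lead time",
    driver="customer project with a hard install date",
    repeat_risk="repeated accelerations quietly distort capacity planning",
    fitness_note="possible fast-track tier for qualified accounts",
)
```

The record is deliberately small: the point is that the exception becomes visible and queryable, not that the form be exhaustive.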
If someone had flagged that first accommodation and its context, the second instance six weeks later would have registered as a pattern, not a surprise. Planning would have been informed before the gap became a crisis. The conversation would have shifted from blame to design: do we build a fast-track tier for qualified accounts, update the standard lead time, or hold the line and invest in managing expectations differently? Research on healthcare systems has documented the same dynamic. Tucker and Edmondson found that hospitals consistently solved problems in the moment but failed to learn from them at the system level, because the organization’s structure didn’t make the pattern visible to anyone with the authority to change it.
And this is where it gets genuinely interesting at the portfolio level. One classified exception is useful. Three months of classified exceptions, reviewed together, is strategic intelligence. You start seeing which standards are under the most pressure. Which functions are generating the most workarounds. Where the shadow standard has quietly replaced the documented one. That pattern data isn’t just operational housekeeping. It’s input for how the business needs to evolve. It’s the operating model telling you, in real time, where it’s aging.
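In practice, the portfolio review can start as something as simple as counting classified exceptions per standard over a quarter and flagging anything that recurs. A hypothetical sketch; the threshold of five echoes the compounding example above and is an assumption, not a rule:

```python
from collections import Counter

def standards_under_pressure(records, threshold=5):
    """Count exceptions per flexed standard; return those recurring often
    enough that the documented standard may no longer fit reality."""
    counts = Counter(r["standard"] for r in records)
    return {std: n for std, n in counts.items() if n >= threshold}

# One quarter of classified exceptions (illustrative data):
quarter = (
    [{"standard": "standard lead time"}] * 6
    + [{"standard": "credit approval"}] * 2
)
print(standards_under_pressure(quarter))
# → {'standard lead time': 6}
```

Six flexes of the same lead-time standard in one quarter is exactly the signal the story above lost: not six isolated incidents, but one standard under sustained pressure.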
The trade-off is real, and it deserves honest framing. This discipline adds friction to a process that currently has none. When that sales director calls operations next time, someone will need to spend five minutes noting what happened and what it might mean. People will push back. It will feel like overhead. Sidney Dekker’s work on organizational drift describes precisely this failure mode: systems don’t degrade through dramatic breakdown, they degrade through small, locally reasonable adaptations that accumulate beyond anyone’s line of sight.
The alternative is what we just lived through. Invisible drift. Standards that decay without anyone choosing to change them. Functions blaming each other for a gap that nobody created on purpose. A five-minute pause when the exception is fresh is vastly cheaper than rebuilding cross-functional trust after two quarters of operating from different versions of reality.
The organizations that do this well share a common trait. They don’t treat exception capture as a compliance exercise. They treat it as the question the organization asks itself every time reality departs from the plan: what just happened, and what is it trying to teach us?
Every organization has exceptions running through it right now. Sales commitments that flex the standard. Procurement workarounds that nobody documented. Production adjustments that solved last week’s problem and are quietly becoming next quarter’s assumption. The exceptions themselves aren’t the risk. The risk is that each one carries information the organization needs and will never see.
Where are exceptions happening weekly in your operation, and what are you learning from them, if anything?
Sources
Anand, G., Ward, P. T., & Tatikonda, M. V. (2010). Role of explicit and tacit knowledge in Six Sigma projects: An empirical examination of differential project success. Journal of Operations Management, 28(4), 303–315.
Argyris, C. (2004). Reasons and rationalizations: The limits to organizational knowledge. Oxford University Press.
Dekker, S. (2011). Drift into failure: From hunting broken components to understanding complex systems. Ashgate.
Edmondson, A. C. (2019). The fearless organization: Creating psychological safety in the workplace for learning, innovation, and growth. Wiley.
Hollnagel, E. (2014). Safety–I and Safety–II: The past and future of safety management. Ashgate.
Lapré, M. A., & Tsikriktsis, N. (2006). Organizational learning curves for customer dissatisfaction: Heterogeneity across airlines. Management Science, 52(3), 352–366.
Tucker, A. L., & Edmondson, A. C. (2003). Why hospitals don’t learn from failures: Organizational and psychological dynamics that inhibit system change. California Management Review, 45(2), 55–72.