Resisting the siren song of better estimation in software development
When software delivery seems unpredictable, one of the common responses is to blame poor planning and estimation.
So we add more process: tighter requirements writing, estimation ceremonies, making developers commit to deadlines based on these estimates, and so on.
The ROI for this type of effort is usually low, because the problem is rarely your planning or estimation process.
The root cause is that most software development processes are unstable in their natural state.
Your planning and estimation process does not, and cannot, account for the random delays introduced while executing work unless the system is stabilized first.
Fundamentally, in an unstable system, there is little correlation between the wall clock time required to deliver a unit of work and the effort needed to complete it.
A task estimated as “small” has roughly the same chance as a “large” one of taking any given amount of wall clock time to deliver end-to-end.
This makes estimates and plans largely unreliable when made against highly unstable systems.
What makes a system stable?
First, you need a well-defined notion of “system.”
This may be an entire company, a business unit, a product line, a development team, or even a single CI/CD pipeline.
Then, given a unit of work, an observation window, and an observation frequency, the system is stable if the following four conditions¹ are satisfied:
The rate at which work is started equals the rate at which work finishes.
All work that is started finishes.
The number of units of work in the system is the same at the beginning and end of the observation window.
The average age of the units of work in the system is neither increasing nor decreasing unboundedly.
The unit of work may be a story or task in the development pipeline, a larger feature or capability on the roadmap, a portfolio of products, a release, a branch with unmerged code, a pull request, a build, etc.
If one or more of these conditions fails, your system is unstable; otherwise, it is stable.
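The four conditions can be checked directly from work-item timestamps. Here is a minimal sketch in Python; the `WorkItem` class, day-based timestamps, and `tolerance` threshold are illustrative assumptions, not something prescribed by the definition, and a real implementation would pull these timestamps from your issue tracker.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkItem:
    started: float                     # days since some reference point
    finished: Optional[float] = None   # None = still in progress

def stability_report(items, window_start, window_end, tolerance=0.1):
    """Check the four stability conditions over one observation window.

    Returns a dict with one boolean per condition. `tolerance` is the
    allowed relative gap between start and finish rates (a judgment call).
    """
    duration = window_end - window_start
    started = [i for i in items if window_start <= i.started < window_end]
    finished = [i for i in items
                if i.finished is not None
                and window_start <= i.finished < window_end]

    start_rate = len(started) / duration
    finish_rate = len(finished) / duration

    def wip_at(t):
        # Units of work in the system at time t
        return sum(1 for i in items
                   if i.started <= t and (i.finished is None or i.finished > t))

    def avg_age_at(t):
        ages = [t - i.started for i in items
                if i.started <= t and (i.finished is None or i.finished > t)]
        return sum(ages) / len(ages) if ages else 0.0

    return {
        "rates_match": abs(start_rate - finish_rate)
                       <= tolerance * max(start_rate, finish_rate, 1e-9),
        "all_started_work_finished": all(i.finished is not None for i in started),
        "wip_unchanged": wip_at(window_start) == wip_at(window_end),
        "avg_age_not_growing": avg_age_at(window_end)
                               <= avg_age_at(window_start) + tolerance * duration,
    }
```

A system where work steadily starts and finishes passes all four checks; one that only accumulates unfinished work fails most of them.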
While the conditions above characterize stability, in a software development context we will be much more interested in using them to understand why a system is unstable.
When a system is unstable under these definitions, it is taking on more work than it can complete during the observation window.
This is the default state for most software development environments and is one of the primary reasons it is hard to execute software development work reliably based on plans and estimates.
A straightforward way to quantify and measure stability is a necessary first step in improving software product development flow, which in turn, is a prerequisite for reducing the unpredictability of software development.
Why should I care about stability?
Analyzing instability is a fundamental building block in helping us answer the question,
“How can we keep work moving smoothly through a system without unwanted delays?” - i.e., the thing we informally call “flow.”
For example, when a system is at or near its maximum capacity to deliver, it will start exhibiting signs of instability, even with relatively small changes to demand. This was the gist of the arguments we presented when we discussed the impact of utilization on wait times in queuing systems.
But the instability signals will flash red, even when the system is well below capacity, if work in the system cannot progress promptly for any reason.
This matters all the more because we lack precise tools to understand a software system’s capacity to deliver work, and because the factors that routinely cause work to stop making progress are varied and numerous. For example:
Poor information flow.
Multi-tasking at work.
Handoffs between functional silos.
Hidden and emergent dependencies.
Overcommitment by teams.
Work that is started but abandoned due to shifting priorities.
All of these are real-world causes of instability, and they will be reflected in violations of one or more of these conditions, provided you measure them.
So these conditions, when turned into measurable signals, are very valuable as diagnostic tools in software development.
Measuring system stability with flow signals
To measure stability, all we need is a clear definition of what it means for a unit of work to be “started” or “finished.”
With that, we can determine what things like the “number of units in the system” and “age of the units in the system” mean.
Once we define the system and its parameters, the definition of stability tells us the five Flow Signals we must measure over the observation window:
The rate at which work starts.
The rate at which work finishes.
The average time each unit of work spends in the system.
The average number of units of work in the system.
The average age of units of work in the system.
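The five signals can all be derived from the same start/finish timestamps. The sketch below is one plausible way to compute them, assuming work items are simple `(started, finished)` pairs in days and that average WIP and average age are approximated by periodic sampling; both choices are illustrative, not part of the definition.

```python
from statistics import mean

def flow_signals(items, window_start, window_end, sample_step=1.0):
    """Compute the five flow signals over an observation window.

    `items`: list of (started, finished) tuples in days; finished may be
    None for work still in progress. Average WIP and average age are
    sampled every `sample_step` days.
    """
    duration = window_end - window_start
    started = [(s, f) for s, f in items if window_start <= s < window_end]
    finished = [(s, f) for s, f in items
                if f is not None and window_start <= f < window_end]

    def in_progress(t):
        return [(s, f) for s, f in items if s <= t and (f is None or f > t)]

    samples = [window_start + k * sample_step
               for k in range(int(duration / sample_step) + 1)]
    wip_samples = [len(in_progress(t)) for t in samples]
    age_samples = [mean(t - s for s, _ in in_progress(t)) if in_progress(t) else 0.0
                   for t in samples]

    return {
        "start_rate": len(started) / duration,     # items per day
        "finish_rate": len(finished) / duration,   # items per day
        "avg_time_in_system": mean(f - s for s, f in finished) if finished else 0.0,
        "avg_wip": mean(wip_samples),
        "avg_age": mean(age_samples),
    }
```

Note that `avg_time_in_system` only counts items that finished inside the window, while `avg_age` also sees items that are still in flight; the gap between the two is itself a useful instability signal.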
For a specific type of work, how you measure these flow signals, the typical observation windows over which stability is assessed, and the accuracy of these measurements will vary quite a bit.
These specific proxy measurements for a unit of work are your Flow Metrics.
We will have much more to say about how to define and measure accurate flow metrics for different units of work in upcoming posts.
To reduce instability in the flow of work in software development, all of these need to be measured, visualized, and tackled, ideally, as soon as they occur.
Learning how to interpret these instability signals, ideally in real-time, is a powerful tool for proactively managing the flow of work, especially in large and complex software development environments.
Another powerful property of these definitions is that the notion of stability is composable, applying to systems at varying levels of granularity of work.
By defining the system and units of work appropriately, we can derive flow metrics based on the same set of five flow signals for portfolios of work, product features, stories and tasks, code changes, pull requests, etc., with associated notions of stability.
By suitably defining systems and units of work, you can stabilize flow across the many systems, sub-systems, and associated processes that comprise a typical product development organization.
You can stabilize a larger system by stabilizing the sub-systems it is composed of, and make inferences about the stability of the larger system by studying the stability of its sub-systems.
Once you have the tools to model and measure system stability at different levels of granularity, you can quickly decide when any system or sub-system under observation is unstable, get to the root causes, and thus figure out how to move it closer to a stable state.
In software, this is not a set-it-and-forget-it process.
It must be continuous and ongoing, which makes establishing and maintaining flow in software development challenging, but quite solvable.
But this stuff does not apply to software!
Note that the definition of stability is very general and makes very few assumptions about the nature or granularity of the work being performed.
In particular, it applies to software processes, despite the widely held belief that they cannot be analyzed this way.
Work in software is both divisible and interruptible.
Because of this, a software development process may violate the four conditions above in ways that you will typically not see in many other contexts, such as industrial production or service processes, where work is relatively more repetitive and has less intrinsic variability.
While these factors make the task of stabilizing a software process different from the process of stabilizing an industrial production process, the analytical tools of stability can still be brought to bear quite effectively here.
We’ll show you precisely how we can do this in upcoming posts.
The most important conclusions so far are,
Stability, or the lack of it, can be measured precisely, and even in real time, for complex systems using simple and intuitive flow signals.
Analyzing the stability of a system and its sub-systems and processes can give deep insights into the impediments to the smooth flow of work in a product development pipeline.
This needs to be a continuous, ongoing process.
In the next few posts, we will jump into the details of how we can measure and reduce instability, paying careful attention to how the unique characteristics of software development impact how we approach this problem.
¹ For those familiar with it, these are the conditions for stability in Little’s Law from queueing theory; a fully rigorous definition would need a few more technical nuances and caveats, but for this informal discussion, this is good enough.
Usually, you will likely see Little’s Law introduced with a formula relating throughput, cycle time, and WIP, as we did in the Iron Triangle of Flow.
In fact, the real utility of Little’s Law in software lies in the preconditions for the formula to be valid. Since most software development processes are unstable, the law usually does not hold. The formula still has its uses, though, as we will see later in the series.
Unlike most applications of Little’s Law, in software, we are primarily interested in using it to understand the behavior of unstable systems, i.e., the case when the law itself does not hold!
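To make the footnote concrete, here is the identity in miniature. The numbers below are made-up illustrative values, not measurements; the point is the shape of the relationship and why it breaks down when the system is unstable.

```python
# Little's Law: for a STABLE system, average WIP (L), throughput
# (lambda, items/day), and average time in system (W, days) satisfy
#   L = lambda * W
# Example values are illustrative only.

throughput = 2.0           # items finished per day
avg_time_in_system = 3.5   # days each item spends in the system
avg_wip = throughput * avg_time_in_system  # L = 2.0 * 3.5 = 7.0 items

# In an UNSTABLE system (e.g. finish rate < start rate), measured average
# WIP drifts away from lambda * W over time, so the identity cannot be
# used to infer one quantity from the other two -- which is exactly why
# the preconditions, not the formula, carry the diagnostic value.
```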