6 Comments
Tristan Slominski

Thank you again, interesting XP analysis.

Two things come to mind at this point in the series.

The first one: is there a third way of working that stabilizes flow? Having seen Kanban explained using sample path analysis, and XP explained using sample path analysis, is there a way to describe yet another way of working by working backwards from a stable process into rules that we haven't stumbled upon in the industry yet?

Second, it is all well and good to see these arguments made for L(T), _but_ the business couldn't care less about L(T). The business usually cares about value and other things, along the lines of H(T). So, I'm wondering if any of these Kanban/XP L(T) arguments translate at all to Kanban/XP stabilizing H(T). Yes, I understand Little's Law applies to H(T). What I am wondering about is whether there is a line of reasoning that goes from Kanban/XP constraints -> stabilize L(T) to Kanban/XP constraints -> stabilize H(T). I don't necessarily see why stable H(T) would emerge from Kanban/XP constraints. I think my discomfort comes from f(t) perhaps not being known live/instantaneously?

(edit s/Scrum/XP, my bad)

Krishna Kumar

One last point. The two cases discussed in the last two posts are two distinct mechanisms by which flows stabilize, and there are only two of them.

The argument is that they represent the two different cases by which we mathematically prove Little's Law. The first case is when WIP starts and ends at zero, and the second is when it doesn't. I call these timeboxed flow vs continuous flow just to align with how people think of these processes, but the technical difference is whether we reset WIP to zero periodically as part of the core process or whether we try to maintain continuous flow.
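To make that concrete, here is a rough sketch with made-up arrival and departure times (the numbers and the time_average_wip helper are purely illustrative, not from the posts). On a window where WIP starts and ends at zero, L(T) = Λ(T)·W(T) holds exactly with the full sojourn times; when items are still in flight at the end of the window, the identity only holds if each item's presence is truncated at the window boundary.

```python
# Illustrative sketch only: made-up arrival/departure times, not data from the posts.

def time_average_wip(arrivals, departures, T):
    """L(T): integrate the WIP level N(t) over [0, T] and divide by T."""
    events = sorted([(a, +1) for a in arrivals if a <= T] +
                    [(d, -1) for d in departures if d <= T])
    area, n, last_t = 0.0, 0, 0.0
    for t, delta in events:
        area += n * (t - last_t)
        n, last_t = n + delta, t
    area += n * (T - last_t)          # WIP still open at the end of the window
    return area / T

T = 5.0
arrivals = [0.0, 1.0, 2.0]

# Case 1: "timeboxed" flow -- WIP starts and ends at zero inside the window.
departures = [3.0, 4.0, 5.0]
L = time_average_wip(arrivals, departures, T)
lam = len(arrivals) / T                                             # Lambda(T), arrival rate
W = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)  # full time in system
print(L, lam * W)                                                   # 1.8 == 1.8, exact equality

# Case 2: "continuous" flow -- two items are still in progress at T.
departures = [3.0, 6.0, 7.0]
L = time_average_wip(arrivals, departures, T)
w = sum(min(d, T) - a for a, d in zip(arrivals, departures)) / len(arrivals)  # truncated at T
print(L, lam * w)   # 2.0 == 2.0, but only with the truncated w(T);
                    # with full sojourn times the identity holds only in the limit
```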

But given these two building blocks there are many ways to combine policies to create a stable process and the post gives a high level sense of how these might work using the pure play versions of each.

In stochastic process theory, processes where the state resets to zero are called renewal processes and there is a whole separate theory for such processes.

Krishna Kumar

Tristan: I probably caused some confusion with that first comment. Ignore it. It's because sometime after I wrote the first post on Little's Law, I changed my notation to use H(T) as the shared symbol for cumulative presence mass, and my response reflects my current mental model, not what I wrote in this post. Your interpretation is correct. What matters is value. My answer in point 3 addresses this.

I will go back and harmonize the notation.

Krishna Kumar

Tristan - first of all, thank you for reading through this series and giving me this level of feedback. I kind of paused writing here back in November because I felt like I was shouting into the void :) It's good to see some questions :)

There are several things you are bringing up here.

1. Stabilizing H(T): we can never stabilize H(T) in time because it is a cumulative quantity. It will only go up unboundedly.

2. Stabilizing L(T) = H(T)/T: I focus so much on the mechanics here because currently in the industry *no one* actually measures L(T). Everyone assumes that limiting WIP means placing a bound on N(T), and that measuring L(T) means taking a sample average of these WIP values over time. Every single flow metrics tool on the market makes this mistake; the sketch at the end of this point shows the difference.

In fact, what this framing is saying is that stabilizing L(T), in whatever way you can achieve it, is tantamount to stabilizing the system. So your intuition is correct. There are infinitely many possibilities for this that you can use in a context-aware fashion, provided you maintain focus on stabilizing L(T) and use Λ(T) and w(T) as levers.

Some of them may even become reusable "standard methods", but I believe the real value is in first learning how to craft your own method for your own context using these ideas from first principles. It is general enough to do that.
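To make the measurement point in 2 concrete, here is a small sketch with made-up data (the presence_mass and wip_at helpers are hypothetical, not from any real tool), contrasting the time-average L(T) = H(T)/T with the snapshot average that flow tools typically report:

```python
# Illustrative only: made-up arrivals/departures, not data from any real tool.

def presence_mass(arrivals, departures, T):
    """H(T): total item-time accumulated inside the window [0, T]."""
    return sum(max(0.0, min(d, T) - max(a, 0.0))
               for a, d in zip(arrivals, departures))

def wip_at(t, arrivals, departures):
    """N(t): number of items present at time t."""
    return sum(1 for a, d in zip(arrivals, departures) if a <= t < d)

arrivals   = [0.0, 0.2, 0.4, 2.0]
departures = [0.6, 0.7, 0.9, 9.5]   # a burst of short items, then one long-running item
T = 10.0

L = presence_mass(arrivals, departures, T) / T        # the actual time-average: H(T)/T

# The common shortcut: sample WIP once per "day" and average the snapshots.
snapshots = [wip_at(t, arrivals, departures) for t in range(1, 11)]
L_sampled = sum(snapshots) / len(snapshots)

print(L, L_sampled)   # 0.91 vs 0.8 -- the snapshot average misses the burst entirely
```

The snapshot average cannot see anything that happens between samples, which is one way the distinction between bounding N(T) and stabilizing L(T) gets lost in practice.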

3. Re: value - L(T) is in fact the most economically important metric we can have, if we assume time-value is the thing we are interested in managing rather than throughput. In the general H = λG case, H(T) is a real-time measurement of the impact of the accumulation over time of some measurable quantity in the system. This is the real domain of the Presence Calculus.

The L = λW form interprets L(T) as the average number of items experiencing delay in the system per unit of time. Conceptually and mathematically, all of this generalizes cleanly if H(T) is interpreted as the accumulation of the effects of superposing arbitrary functions of time (think of them as cost or reward functions) associated with the items.

This gives you a complete set of ways to model cost/reward-style optimization problems as flow problems. As a simple example, if we can attach cost-of-delay functions to items (arbitrary time-value functions), then H(T) becomes accumulated cost and H(T)/T becomes a cost-based flow metric. We would make decisions not based on the raw magnitude of delay, but on the overall cost of delay of the WIP. This leads to completely different policies and completely different ways of treating WIP than otherwise (for example, pre-emption and multi-tasking may be perfectly fine if we end up with policies that demonstrably minimize overall cost of delay). We have focused exclusively on process, whereas in reality policies have a much greater impact on flow.
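A hedged sketch of what that could look like, with made-up items and simple linear cost-of-delay rates standing in for the arbitrary time-value functions (the Item and cost_weighted_H names are illustrative):

```python
# Illustrative only: made-up items; constant cost rates are a simplification of
# the arbitrary cost-of-delay functions described above.

from dataclasses import dataclass

@dataclass
class Item:
    arrival: float
    departure: float
    cost_rate: float   # cost of delay per unit time while the item is in the system

def cost_weighted_H(items, T):
    """H(T) with a cost rate in place of f(t) = 1: accumulated cost of delay in [0, T]."""
    return sum(it.cost_rate * max(0.0, min(it.departure, T) - max(it.arrival, 0.0))
               for it in items)

items = [
    Item(arrival=0.0, departure=8.0, cost_rate=1.0),   # long-running but cheap to delay
    Item(arrival=1.0, departure=3.0, cost_rate=20.0),  # short but very expensive to delay
]
T = 10.0

H_cost = cost_weighted_H(items, T)
print(H_cost, H_cost / T)   # accumulated cost of delay, and the cost-based flow metric H(T)/T
```

Setting every cost rate to 1 recovers the plain presence mass H(T), which is why the count-based and cost-based views are two instances of the same identity.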

4. If we merge sample path analysis with classical Forrester-style system dynamics, we get a sample path approach to modeling complex system dynamics with both flow and accumulation nodes and feedback loops, without the assumptions of closed-form rate formulas that classical approaches require. The mathematics is compatible at a very deep level. The list of ways we can use this is so large that it is difficult to put a neat box around it. I am still exploring what this means.

5. Finally, stability: we have traditionally looked at stabilization as the end goal of flow analysis. My current thinking is that the real power of Little's Law is in enabling fine-tuned steering of unstable systems that may never stabilize. Our focus on stability comes from the manufacturing lens, where we value predictability, throughput, promise dates, etc. Those are still important in many contexts. But I think with the tech shift we will experience in the next couple of years, we will need flow analysis techniques that are able to work in high-variability, high-instability VUCA environments more than ever.

My argument is that sample path analysis and Little's Law in its general H = λG form are precisely the tools we need for building these sorts of measurement and steering systems.

Right now a lot of my focus is on figuring out what these measurement systems will look like in specific use cases and understanding where we can use them for decisive advantage in the AI-augmented software development world we are transitioning to. I think they will be very useful here, as all the existing assumptions about how processes should work in our industry are going to be upended.

Now that I have at least one person reading these posts carefully, I'll work up the enthusiasm to write more :)

Tristan Slominski

"We would make decisions not based on the raw magnitude of delay, but the overall cost of delay of WIP."

I think this might be the essence of my question. If I care about the cost of delay function instead of the raw magnitude of delay in the system, is there a predictable relationship between controlling the raw magnitude of delay (via Kanban/XP) and the behavior of my cost of delay function? This is where I'm wondering if there is a gap.

Let's assume that I have Kanban/XP levers to control the raw magnitude of delay in the system based on raw-magnitude-of-delay-focused decisions.

Do those Kanban/XP raw-magnitude-of-delay-focused decisions predictably impact the thing I care about, the cost of delay function instead? I think the answer is no.

But, if I understand your answer correctly, I think what you're saying is that what I need to do is a Kanban/XP approach where I make cost-of-delay-focused decisions instead, and that will impact the cost of delay function. Yes?

Krishna Kumar

Precisely! This is the fundamental flaw in how we think about the relationship between the flow of work and the flow of value today.

Kanban in particular optimizes the flow of work and claims that will improve the flow of value.

This is true only if the unit economics of the work are well enough defined that we can marginalize them out of the equation or reduce them to priorities, etc.

If a good cost of delay function or a decent proxy is available, then measuring the cost of delay in real time and making scheduling decisions on that basis will lead to flows that look janky (technical term :) and irregular when measured on a pure delay basis, but might actually produce lower-cost outcomes (there is a toy example below).

It depends on how you manage it.
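One toy way to see that trade-off (made-up jobs, a single worker, strictly sequential processing, linear cost of delay; the cost_rate/duration ratio below is just one WSJF-style proxy, not a prescription): sequencing by raw delay and sequencing by cost of delay produce different orderings, and the cost-based sequence looks worse on average delay but cheaper in total cost.

```python
# Toy example only: made-up jobs, one worker, no pre-emption, linear cost of delay.

jobs = [
    {"name": "A", "duration": 1.0, "cost_rate": 1.0},
    {"name": "B", "duration": 5.0, "cost_rate": 50.0},
    {"name": "C", "duration": 2.0, "cost_rate": 2.0},
]

def evaluate(order):
    """Return (average completion time, total cost of delay) for a given sequence."""
    t, completions, cost = 0.0, [], 0.0
    for job in order:
        t += job["duration"]
        completions.append(t)
        cost += job["cost_rate"] * t      # linear cost of delay: rate * completion time
    return sum(completions) / len(completions), cost

by_raw_delay = sorted(jobs, key=lambda j: j["duration"])               # shortest job first
by_cost = sorted(jobs, key=lambda j: j["cost_rate"] / j["duration"],   # WSJF-style ratio
                 reverse=True)

print([j["name"] for j in by_raw_delay], evaluate(by_raw_delay))   # avg delay 4.0, cost 407
print([j["name"] for j in by_cost], evaluate(by_cost))             # avg delay ~6.3, cost 272
```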