In our last post, The Many Faces of Little’s Law, we showed that Little’s Law has undergone several upgrades since the original proof by Dr. Little in 1961.
The original proof was ground-breaking and Dr. Little is rightfully associated with the law, but over the years, as this field has evolved, the association of the law with its original statement and proof has also resulted in several myths about what Little’s Law is and where and when it can be applied.
Even though Little’s Law is most often encountered in conjunction with queueing theory, modern proofs of the law are independent of queueing theory. While it is most often applied in analyzing the flow of work in an operational setting, the core principles underlying Little’s Law have allowed it to be extended naturally to analyzing the economics of that flow of work: the costs, risks, and rewards arising from it.
The law applies to certain classes of stochastic processes, and deterministic systems are a special case. But it does not depend on the internal structure of these systems, just their observable input-output behavior1. It applies equally well to linear or non-linear systems, simple or complex systems, and even certain types of chaotic systems.
However, in typical applications, this generality is not exploited mainly because the law is largely assumed to apply only under the stringent conditions that Dr. Little proved in 1961.
Sample path analysis2, a mathematical technique pioneered by Dr. Shaler Stidham in proving the first deterministic version of the law in 1972, is a core theoretical technique underpinning the evolution of the law.
It is of particular interest in software product development because Dr. Stidham showed that many of the perceived limitations that are typically associated with Little’s Law - requiring steady state equilibrium, stationary distributions, and low variability - are not intrinsically needed to apply Little’s Law in operational contexts.
This has practical implications. Our first post laid out the theory in broad strokes and our deep dive article3 goes into more details. We will discuss many more specific applications in this series.
But before we begin, this post establishes the historical context, exploring the evolution of Little’s Law through its mathematical proofs. These are powerful generalizations of Dr. Little’s result, developed over 30+ years by mathematicians, that moved it from an empirically observed regularity in operational settings into a foundational law of systems analysis.
Examining this history gives us insight into why it is a subtle result, and also why it was possible to loosen restrictions and generalize the law into what it is today. This will be crucial context as we dig deeper into the details of applying the law in its more general forms.
Empirical Origins
Little’s Law began its life in the mundane world of operations management—as an empirical regularity observed in retail stores, call centers, and other service operations. Observers noticed that when more customers came in and stayed longer, the stores got more crowded. That seems obvious. What was less obvious was that the averages of arrivals, the time customers stayed, and the number of customers in the store at any given time were related by an exact mathematical relationship.
The mathematical relationship observed was:

L = λW

where λ is the rate at which customers arrive, W is the average time a customer spends in the store, and L is the average number of customers in the store.
This relationship between averages was confirmed empirically time and again and was widely assumed to be true, but no one knew why it should hold.
As we will see in this series, that is anything but obvious. Behind it is a profound and general truth about the dynamics of any system where entities enter and leave over time - a truth we now know as Little’s Law.
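The empirical regularity is easy to reproduce in a toy simulation. The sketch below is our own construction, not from the original studies: the arrival rate and the stay-duration distribution are arbitrary choices. It generates random arrivals and stays for a hypothetical store, then measures the three averages independently from the same event record and checks that they satisfy L = λW:

```python
import random

random.seed(42)

T = 100_000.0                      # length of the observation window
t = 0.0
events = []                        # (time, +1) for arrivals, (time, -1) for departures
n_arrivals, total_stay = 0, 0.0

while True:
    t += random.expovariate(2.0)   # mean time between arrivals = 0.5
    if t >= T:
        break
    stay = random.uniform(0.5, 1.5)  # how long this customer remains
    events.append((t, +1))
    events.append((t + stay, -1))
    n_arrivals += 1
    total_stay += stay

lam = n_arrivals / T               # observed arrival rate (~2.0)
W = total_stay / n_arrivals        # observed average stay (~1.0)

# Exact time average of N(t), the number in the store, via an event sweep
# over [0, T]: accumulate N * (elapsed time) between consecutive events.
events.sort()
area, n, prev = 0.0, 0, 0.0
for time, delta in events:
    clipped = min(time, T)
    area += n * (clipped - prev)
    prev = clipped
    n += delta
area += n * max(0.0, T - prev)
L = area / T

print(f"lambda={lam:.3f}  W={W:.3f}  lambda*W={lam*W:.3f}  L={L:.3f}")
```

The measured L agrees with λW up to a small boundary error from customers still in the store at time T, and that error shrinks as the observation window grows.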
Why this is important
Empirical observation often reveals correlations—associations where two quantities move in concert. These correlations can be a starting point for causal reasoning, prompting us to search for underlying mechanisms that might explain the linkage. However, correlations do not always indicate causation, as they may be influenced by confounding variables.
It is important to note that the empirical observations leading to Little’s Law are much stronger than statistical correlations. They describe an equation relating key variables. When variables are linked by an equation, the space of plausible explanations for change narrows dramatically.
For example, if the average number of customers in a system increases, there are only three possibilities: the arrival rate increased, the average time each customer spent in the system increased, or both. No other explanation is consistent with the equality; there are no confounding variables.
Knowing that a relationship like Little’s Law holds gives us a robust framework for causal reasoning about observable system behavior. It doesn’t tell us what will happen in the future, but it constrains what can happen. Whenever the equation holds, these three quantities remain bound by the relationship between their averages.
We have a powerful causal reasoning tool if we can rigorously prove the conditions under which this equation holds for any system with inflows and outflows. That is why this result—and the man who proved it—are rightly honored with the name Little’s Law.
Dr. Little proves the law
In 1961, Dr. John Little4, at the MIT Sloan School, gave the first mathematical proof of this empirical relationship. His proof method used the most popular tool for solving these problems: queueing theory, developed decades earlier to model congestion in telephone exchanges5.
The precise result he proved was the following:
In a queuing process, let 1/λ be the mean time between the arrivals of two consecutive units, L be the mean number of units in the system, and W be the mean time spent by a unit in the system. It is shown that, if the three means are finite and the corresponding stochastic processes strictly stationary, and, if the arrival process is metrically transitive with nonzero mean, then L = λW.
Queueing theory typically uses probabilistic models of arrivals and departures, and most results here depend on specific assumptions about how stochastic processes behave.
What made his result so remarkable was how broadly it applied: he showed that the relationship held regardless of the shapes of the underlying probability distributions. That was an unexpectedly general result for queueing theory.
It pointed to something deeper—something structural about input-output systems. No matter how arrivals and completions fluctuated, or whether they were regular, random, or correlated, the strict equality of Little’s Law still held.
This eliminates randomness as the underlying mechanism that governs why the law holds. It also removes variability as a driving force. While variability still affects the behavior of the systems—how large the values of these averages are and how much they fluctuate—it does not invalidate the relationship.
In fact, his proof showed that what happens inside the system doesn’t matter to the relationship at all!
From a queueing theory perspective, Little’s Law, it turns out, describes the dynamic behavior of arrival and departure processes as they interact in time. It says that no matter what these processes are, the dynamic behavior of the queue, expressed via the relationship between these averages, is constrained by the above equation, with stationarity of the probability distributions as the only non-trivial condition.
Little’s proof was a special case
Dr. Little's proof required a set of conditions on these averages—specifically, that they were well-defined and finite. These are relatively mild conditions. But his proof also required that the underlying processes were strictly stationary: that is, the long run statistical properties6 of the system did not drift over time, and the arrival process was ergodic (this is the metrically transitive part). These are significant caveats. They imply that, as proven initially, the formula in Little's Law applies only to systems operating in a very well-behaved and stable equilibrium.
However, since Dr. Little's original paper, newer proofs of the law have shown that his argument relies on stronger assumptions than necessary. Specifically, they show that the requirements of stationarity and ergodicity in Little’s proof are not fundamental to the law itself—they are consequences of the particular proof technique he used. As such, they are sufficient conditions for the law to hold, but not necessary.
There’s value in understanding why.
Little’s Law is fundamentally a statement about time averages—quantities that can be directly observed in real-world systems over time. Dr. Little, however, used probabilistic techniques to prove a relationship between the expected values of random variables, which are called ensemble averages. Unlike time averages, ensemble averages are computed across the entire probability distribution of a variable and cannot depend on how the system evolves over time.
Little’s proof established the equality between these ensemble averages. To apply this result to real-world systems, one must assume that the ensemble averages and the time averages of the corresponding processes converge to the same value. This convergence is guaranteed when the underlying stochastic processes are stationary and ergodic—hence the need for those assumptions in Little's original proof.
But this doesn’t mean that stationarity and ergodicity are required for the law itself. It is entirely possible to prove Little’s Law directly for time averages, bypassing probability theory altogether.
This is precisely what Dr. Shaler Stidham of Cornell did in 1972 7.
Stidham’s Proof of Little’s Law
Stidham demonstrated that Little's Law was even more general than previously believed. His proof was purely deterministic and did not rely on any probabilistic or queueing theory techniques. It was based on standard techniques from continuous analysis, using time averages on sample paths for queue length, calculated over finite intervals. It eliminated the requirements for stationarity, ergodicity, and related assumptions.
The specific statement he proved was the following:
Consider a system observed over a sufficiently long interval 0 ≤ T < ∞. If the time-average arrival rate and the average time each item spends in the system over this window each converge to finite limits—call them λ and W, respectively—then the time-average number of items in the system also converges to a finite limit L, and the relationship L = λW holds.
In contrast to Dr. Little’s original formulation, which relates the expected values of probability distributions, this version establishes a relationship between the limits of time averages observed along a sufficiently long sample path over which time averages are calculated.
This is a significant generalization, but it also needs to be read carefully to understand precisely what Little’s Law is asserting—and the specific conditions under which it holds.
Stidham’s proof did not even require queueing theory assumptions like a defined arrival or departure process. So, technically, he showed that Little’s Law is much more than a result that applies only to queueing systems: it is a general statement about the long-run dynamics of any input-output system where entities arrive at and depart from the system over time.
The only requirements for the law to hold were that the three averages in the equation were well-defined and converged to a finite limit when measured over sufficiently long observation windows.
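The deterministic core of Stidham’s argument can be seen in a few lines of arithmetic. In the sketch below (our own construction with arbitrary numbers, not Stidham’s example), we take a fixed record of arrivals and departures over a window in which the system starts and ends empty; the three finite-window averages then satisfy L = λW exactly, with no probabilistic assumptions at all:

```python
# (arrival_time, departure_time) pairs for five hypothetical work items.
items = [(0.0, 4.0), (1.0, 2.5), (2.0, 7.0), (5.0, 6.0), (6.5, 9.0)]
T = 10.0                     # the system is empty at t = 0 and at t = T

n = len(items)
lam = n / T                                  # finite-window arrival rate
W = sum(d - a for a, d in items) / n         # finite-window average residence time

# Time average of N(t): each item contributes 1 to N(t) exactly over its
# residence interval, so the integral of N(t) is the sum of residence times.
L = sum(d - a for a, d in items) / T

print(L, lam * W)            # identical: 1.4 and 1.4
```

The equality here is an accounting identity: the area under the number-in-system curve equals the sum of the individual residence times. Sample path analysis builds the general law from this conservation principle.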
His proof reinforces the idea that the formula L = λW, as originally proved by Dr. Little, applies to systems in equilibrium, albeit under much weaker conditions than those in the original proof. But Stidham’s proof technique, sample path analysis, also opened up ways of exploiting a key conservation principle that functions as a universal invariant behind the law.
As we’ll see in this series, this allows us to apply the law under both equilibrium and non-equilibrium conditions. This is what allows Little’s Law to bridge from equilibrium environments to dynamic, evolving, and complex systems like software development which often operate far from equilibrium.
Starting with our next post, we will examine Stidham's specific result more carefully, using his ideas to show how, when measured appropriately, Little's Law can always be empirically validated and used in software development.
The General Form of the Law
Stidham’s extensions of sample path analysis led to a very general version of Little's Law where, instead of indexing on items, we index on arbitrary time-varying functions of items, subject to some weak restrictions.
A version of this general form was independently proven earlier, in a somewhat different context, by Brumelle8, but Heyman and Stidham9 showed that the result could be obtained by directly applying the sample path arguments used to prove L = λW, which established the connection to Little’s Law.
The general form of the law is stated as H = λG, where H and G are direct mathematical analogues of the time averages L and W, except that they are defined over time-varying functions attached to the items flowing through the process. This allows us to generalize Little’s Law to economic quantities tied to processes, such as costs, risks, revenues, and profits.
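To make this concrete, here is a minimal sketch. It is our own construction: the cost rates, the window, and the assumption that each item accrues cost at a constant rate while in the system are illustrative choices, not taken from the references. With each item’s function equal to its cost rate over its residence interval, H is the time-average rate of cost accrual, G is the average cost accrued per item, and H = λG holds over a window that starts and ends empty:

```python
items = [
    # (arrival, departure, cost_rate): cost accrues at cost_rate while in system
    (0.0, 4.0, 10.0),
    (1.0, 2.5, 25.0),
    (2.0, 7.0, 5.0),
]
T = 8.0                      # window over which the system starts and ends empty

n = len(items)
lam = n / T                                          # arrival rate
G = sum((d - a) * c for a, d, c in items) / n        # average cost accrued per item
H = sum((d - a) * c for a, d, c in items) / T        # time-average cost accrual rate

print(H, lam * G)            # identical: 12.8125 and 12.8125
```

Replacing “cost” with risk exposure, revenue, or any other quantity that accumulates while an item is in the system gives the corresponding economic reading of H = λG.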
As we discuss in our deep dive on Little’s Law, this generalization is substantially more complicated even to set up and describe compactly. However, this version of Little's Law, called the Generalized Little's Law, has critical applications to modeling the economics of flow and will be central to understanding how to measure the relationship between the flow of work and the flow of value - something we have been exploring in some detail in this series.
But before we can fully appreciate that version and what it says, we have to understand the importance of sample path analysis as it underlies both Stidham’s proof and its generalizations. This is what we will explore further and apply specifically in a software development operations management context, in the next few posts.
At this point, Little's Law is considered to be much more than a result in queueing theory. It has earned something close to the ontological status of a natural equilibrium law that reveals a fundamental relationship that applies to very general forms of stochastic processes.
This is a powerful property that can be exploited in many different ways once established, and it is the real reason Little's Law has been so useful in every domain to which it has been successfully applied so far.
Post Script 1: The Throughput Formula
It’s worth pausing to explain why the version of Little’s Law commonly used in software today looks quite different from the one we’ve discussed so far.
Dr. Little’s original proof applied to stationary systems with finite averages—conditions that most manufacturing processes in steady state typically satisfy. In practice, this means that under equilibrium, arrival rates can be assumed equal to departure rates. This allows us to write:
L = X·W

where X is the departure rate, or throughput. Rearranging this and renaming the terms gives us the familiar throughput formula:
Throughput = WIP / Cycle Time
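As a quick worked example (the numbers here are hypothetical), the rearrangement can be checked directly:

```python
# Hypothetical flow metrics for a team (illustrative numbers only).
wip = 20.0            # L: average number of work items in process
cycle_time = 5.0      # W: average days an item spends in process

throughput = wip / cycle_time     # X: items completed per day, on average
print(throughput)                 # 4.0

# Sanity check: the same numbers satisfy the L = X * W form.
assert abs(wip - throughput * cycle_time) < 1e-12
```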
The throughput form of Little’s Law was originally used in manufacturing contexts, and its utility was highlighted in Hopp & Spearman’s seminal textbook10. There it is a natural fit: factory managers focus on optimizing WIP and cycle time to meet a fixed throughput goal derived from stable production orders. It was imported into Lean software development, originally by Tom and Mary Poppendieck11, operationalized in the software Kanban community by David Anderson12 and Dan Vacanti13, and more recently applied in enterprise contexts by Dr. Mik Kersten14.
But in software development, direct application of this formula for operational metrics is problematic. Much of software development happens in an environment where demand is constantly shifting, capacity cannot be accurately measured, and variability in the work is endemic and often irreducible. Much of the work happens at the boundary between two complex processes (the customers and a software provider) and involves long-running concurrent processes of humans and machines on both sides.
This introduces non-trivial features into the problem of measuring flow that are hard to fully capture with the steady-state throughput formula, but that are nonetheless representable using the more general forms of Little’s Law.
In our treatment of Little's Law going forward, we’ll explore its more general forms as the basis for measuring flow in software development. Minimally, this means starting with the arrival rate form and adopting Stidham’s sample path analysis to measure flow metrics.
For example, one of the key flow metrics that is now recognized as needing specific focus in software contexts is work item age. As we will see, under the more general framing of Little’s Law, this metric falls out quite naturally as a core component of residence time which is directly represented in the finite version of Little’s Law.
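As a simple sketch of the idea (our construction, with hypothetical timestamps): for items still in the system at an observation time, age is just the elapsed time since arrival, which is exactly the partial residence time that a finite-window formulation has to account for:

```python
# Hypothetical observation of in-flight work items.
t_now = 10.0                          # current observation time (arbitrary units)
in_flight_arrivals = [3.0, 6.5, 9.0]  # arrival times of items not yet finished

# Work item age = elapsed time since arrival for each unfinished item.
ages = [t_now - a for a in in_flight_arrivals]
avg_age = sum(ages) / len(ages)
print(ages)        # [7.0, 3.5, 1.0]
print(avg_age)
```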
Many of the adaptations of flow management techniques derived from Lean will still apply in terms of day-to-day operations, but the way we measure flow in general systems contexts, and its impact, will become significantly more streamlined and powerful. As a result, we will also gain new insights into how we can use these measurements to further evolve and improve the techniques we use in day-to-day operations.
Post Script 2: Further reading
In 2011, Dr. Little published a survey15 of all the developments related to Little's Law since its original proof, on its 50th anniversary. Most of this series started from a close reading of that paper and its source references. I highly recommend it if you’re interested in a more technical, yet accessible, presentation of the material in this post with an excellent first person account from Dr. Little himself on his journey with the law and its proofs.
Interestingly, in this survey article Dr. Little starts with Dr. Stidham’s result as the canonical statement of Little’s Law, more or less ignoring the statement that he originally proved!
This is not to say that his result has somehow become less useful. It absolutely is useful in many operational contexts where stationarity is guaranteed. But the key lesson you will take away from the survey article is that stationarity is not required, and that this frees us up to apply the law in vastly more general settings.
Our deep dive article is another reference for going deeper here, with a specific focus on what we will find necessary in adapting this material to operational use in software development.
References and Footnotes
More precisely, it applies to stochastic processes built from point processes (events) — arrivals and departures, for example — from which we construct the random variables of interest.
Sample path analysis of queueing systems, El-Taha & Stidham, Springer, 1999.
A deep dive into Little’s Law, Krishna Kumar, The Presence Calculus Project Documentation, 2025.
A proof of the queueing formula: L = λW, John D. C. Little, Informs, 1961.
Queues, inventories, and maintenance: the analysis of operational systems with variable demand and supply, Philip Morse, Wiley, 1957.
Note: Philip Morse, in this seminal textbook, challenged his readers to find a situation where the law was not true - a challenge that his doctoral student John Little took on.
This includes the average and all higher order moments like variance, skewness etc. It is a very strict set of conditions!
A last word on L = λW, Shaler Stidham, Informs 1972.
Note: We have taken a few liberties in stating this version of the law in our post to keep the exposition clear. The more precise version can be found in our deep dive paper.
On the relation between customer and time averages in queues, Brumelle, Journal of Applied Probability, 1971.
The relation between customer and time averages in queues, Heyman & Stidham, Informs 1980.
Factory Physics 3rd Ed, Hopp & Spearman, Waveland Pr, 2011
Lean software development, an Agile toolkit, Mary and Tom Poppendieck, Addison-Wesley, 2003
Kanban: successful evolutionary change for your technology business. David J Anderson, Blue Hole Press, 2010.
Actionable agile metrics for predictability, 10th Anniversary Edition, Daniel Vacanti, ActionableAgile(R) Press, 2025.
Project to Product: How to Survive and Thrive in the Age of Digital Disruption with the Flow Framework, Mik Kersten, IT Revolution, 2018.
Little’s Law on its 50th anniversary, John Little, Informs, 2011.