When we left our protagonists at the end of Episode 1, the Advisor had explained to the Entrepreneur the importance of a causal model. Such a model connects output metrics such as throughput, cycle time, and customer lead time to controllable inputs such as arrival rate, service rate, and utilization, and explains how changes in the inputs cause changes in the outputs.
In particular, the Entrepreneur was surprised by the tradeoff between throughput (or productivity) and customer lead time, and by the role that the randomness introduced by queueing plays in creating this tradeoff.
We pick up the story here. Read Part I of The Entrepreneur and The Queueing Theorist if you are just joining.
Understanding Queueing
“It’s straightforward to understand queueing if you think of it as orders backing up in front of your service when customers place their orders at around the same time. Your developers can only work on a small number of orders simultaneously, and if new orders come in when they are busy with other orders, the new orders will have to queue up and wait,” said the Advisor.
“Very often, there is no queueing if orders are spaced well enough apart in time. At other times, queues build up for a little while and get drawn down as your developers catch up.”
“This is precisely the same thing that happens when you are in line at the supermarket or the security line at the airport. Depending on how long the queueing lasts, the time a customer waits for their order to be fulfilled will vary quite a bit, even if the actual development cycle time needed for each order is very stable.”
“In particular, if orders consistently arrive at a rate that your process cannot service, we have a situation where queueing happens continuously. When this happens, there is no limit to how long customers might have to wait for their orders. We never want to be in this situation.”
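For readers who want to see this runaway behavior for themselves, here is a minimal sketch. It is not the Advisor's model: it assumes a single worker handling one order at a time, a fixed service time, and exponentially distributed gaps between arrivals. The function name `simulate_waits` and all parameter values are hypothetical and chosen only for illustration.

```python
import random

def simulate_waits(mean_interarrival, service_time, n_orders, seed=0):
    """Waiting time (in days) before work starts on each order.

    Uses the Lindley recursion for a single worker:
    next_wait = max(0, wait + service_time - gap_to_next_arrival).
    """
    rng = random.Random(seed)
    wait = 0.0
    waits = []
    for _ in range(n_orders):
        waits.append(wait)
        # Assumption: gaps between arrivals are exponentially distributed.
        gap = rng.expovariate(1.0 / mean_interarrival)
        wait = max(0.0, wait + service_time - gap)
    return waits

def mean(values):
    return sum(values) / len(values)

# Hypothetical scenarios: orders arrive slower vs. faster than they can be served.
keeping_up = simulate_waits(mean_interarrival=1.0, service_time=0.5, n_orders=5000)
overloaded = simulate_waits(mean_interarrival=0.4, service_time=0.5, n_orders=5000)

for name, waits in [("keeping up", keeping_up), ("overloaded", overloaded)]:
    print(f"{name}: first 500 orders wait {mean(waits[:500]):.2f} days on average, "
          f"last 500 orders wait {mean(waits[-500:]):.2f} days")
```

In the overloaded scenario the waits keep growing the longer the simulation runs, while in the other scenario they stay small and fluctuate.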
“Queueing introduces randomness into even deterministic processes; thus, we need probabilistic tools to reason about this behavior. Luckily, queueing theorists have been studying this problem for over a century, and we have powerful mathematical tools called queueing models that will allow us to analyze queueing in stochastic processes and reason about its causes.”
The Queueing Model
“We’ll construct a probabilistic model of how your customers interact with your service, which will clearly explain how queueing is causing you to lose money on the one-day guarantee,” the Advisor says.
“This model, like all queueing models, consists of three components:
The Arrival Distribution A: A probability distribution that describes how requests arrive at the service over time, that is, the shape of the demand for the service.
The Service Distribution S: A probability distribution that describes how long it takes your service to fulfill requests.
The Service Concurrency c: The maximum number of requests the service can process simultaneously.
Note that all three components relate to time: how often requests arrive, how long they take to process, and how many the service can process simultaneously.
Queueing models explain how delays emerge in a system, and these three components are the basic building blocks you need to reason about this.”
“Let’s look at what these components look like for your system,” she says.
The Arrival Distribution
“The arrival distribution gives the probability distribution of the time between consecutive arrivals at your service. The intuition here is that queueing occurs when many orders arrive in a short window of time, and so we want to capture how likely it is that inter-arrival times are short rather than spread out.”
“It’s important to note that these are long-term probabilities. We analyzed several months of order data to understand arrival patterns and built this probability distribution from that analysis. Since your business is growing fast, you must keep an eye on these probabilities since they will change and have a material impact on your business as it grows.”
“We can see that currently, the arrivals are skewed toward the high-frequency end, which means that customer orders usually arrive close to each other. The average inter-arrival time is 0.41 days, slightly smaller than the 0.5 days it takes to service an order, so we should expect to see some queueing under these circumstances.”
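As a rough illustration of how inter-arrival times can be extracted from order data, here is a minimal sketch. The arrival times below are made up for the example (they are not the Entrepreneur's data), and the helper names `interarrival_times` and `share_shorter_than` are hypothetical.

```python
def interarrival_times(arrival_times):
    """Gaps between consecutive arrivals, in the same unit as the input."""
    ordered = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def share_shorter_than(gaps, threshold):
    """Fraction of gaps that are shorter than the given threshold."""
    return sum(1 for gap in gaps if gap < threshold) / len(gaps)

# Hypothetical arrival times, in days since the start of the observation window.
arrivals = [0.0, 0.1, 0.3, 1.0, 1.2, 1.25, 2.5]

gaps = interarrival_times(arrivals)
mean_gap = sum(gaps) / len(gaps)
short_share = share_shorter_than(gaps, threshold=0.5)  # 0.5 days = service time in the text

print(f"mean inter-arrival time: {mean_gap:.2f} days")
print(f"share of gaps shorter than one service time: {short_share:.0%}")
```

A histogram of `gaps` over several months of real order data would play the role of the arrival distribution described above.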
The Service Distribution
“The service distribution shows the probability distribution of your process’s service times. This is the time it takes your team to fulfill an order once they start working on it.”
“Your process is highly optimized, and most orders are processed in around 0.5 days with minimal variability. Our model will approximate this with a deterministic (constant time) distribution to illustrate the impact of queueing much more clearly. In reality, this service has some variability, so this is a conservative assumption. The queueing impact will only worsen once we account for this.”
The Service Concurrency
“Finally, we’ll model the service concurrency c=2 since you can process at most two orders concurrently with your current team.
For now, we’ll assume there is no multi-tasking and that the team takes an order from start to finish before starting a new one. Your work-in-progress data seems to bear this out, and again, this is a conservative assumption that will highlight the impact of queueing.”
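The three components can be collected into a small data structure. The sketch below uses the figures quoted in the dialogue (0.41 days between arrivals, 0.5 days of service, c=2). The class name `QueueingModel` is hypothetical, and it assumes that utilization means the arrival rate divided by the combined service rate of all concurrent slots.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueueingModel:
    mean_interarrival: float  # average time between arrivals, in days
    mean_service: float       # average time to service one order, in days
    concurrency: int          # c: orders that can be worked on at once

    @property
    def arrival_rate(self):
        """Orders arriving per day."""
        return 1.0 / self.mean_interarrival

    @property
    def service_rate(self):
        """Orders completed per day when every slot is busy (assumed definition)."""
        return self.concurrency / self.mean_service

    @property
    def utilization(self):
        """Ratio of arrival rate to total service rate."""
        return self.arrival_rate / self.service_rate

    @property
    def is_stable(self):
        """True when the service can keep up with demand on average."""
        return self.utilization < 1.0

model = QueueingModel(mean_interarrival=0.41, mean_service=0.5, concurrency=2)
print(f"arrival rate: {model.arrival_rate:.2f} orders/day")
print(f"service rate: {model.service_rate:.2f} orders/day")
print(f"utilization:  {model.utilization:.0%}")
```

Under that assumed definition, these inputs give a utilization of roughly 61%.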
The Probability of Queueing
“The first thing to assess is the probability that queues will form in your system, given the three parameters above. We do this by looking at probabilities for the total number of items in the system, both in progress and waiting for service.”
“We noted earlier that the ratio of the arrival rate to the service rate, which we defined as utilization, is an important driver of how the system behaves. The queueing probabilities are shown here for utilization of 60%, which is where your system is currently.”
“We see that the most likely scenario is that the number of items in the system is less than the concurrency limit, which implies that there is no queueing in those cases. Again, this agrees with our intuition since arrival rates are well below service rates.”
“However, queueing will occur whenever the number of orders in the system exceeds the concurrency limit. We can calculate the probability of this happening as the area under this curve beyond the solid red line in the distribution, and this number is significant at 41%.”
“We see that up to 5 orders can be queued up in the system at a time, waiting for service. These are the potential candidates for higher customer lead times. The critical question is how this queueing affects the probability of the customer lead time exceeding a day. Those are the orders that trigger your losses on the one-day guarantee!”
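One way to estimate these probabilities without the underlying math is simulation. The sketch below is an approximation under stated assumptions, not the Advisor's analysis: it assumes exponentially distributed gaps between arrivals (the article uses a measured distribution instead, so the 41% figure will not be reproduced exactly), a constant service time, and first-come-first-served processing. The function name `orders_seen_on_arrival` is hypothetical.

```python
import heapq
import random
from collections import Counter

def orders_seen_on_arrival(mean_interarrival, service_time, concurrency,
                           n_orders, seed=0):
    """Distribution of how many orders are already in the system
    (in progress or waiting) at the moment each new order arrives."""
    rng = random.Random(seed)
    free_at = [0.0] * concurrency  # min-heap: when each slot next becomes free
    departures = []                # min-heap: finish times of orders still in the system
    counts = Counter()
    now = 0.0
    for _ in range(n_orders):
        # Assumption: exponentially distributed gaps between arrivals.
        now += rng.expovariate(1.0 / mean_interarrival)
        # Drop orders that have already finished by the time of this arrival.
        while departures and departures[0] <= now:
            heapq.heappop(departures)
        counts[len(departures)] += 1
        # First come, first served: take the slot that frees up earliest.
        start = max(now, heapq.heappop(free_at))
        finish = start + service_time
        heapq.heappush(free_at, finish)
        heapq.heappush(departures, finish)
    return {k: v / n_orders for k, v in sorted(counts.items())}

def probability_of_waiting(distribution, concurrency):
    """An arriving order must queue when all slots are already taken."""
    return sum(p for k, p in distribution.items() if k >= concurrency)

dist = orders_seen_on_arrival(mean_interarrival=0.41, service_time=0.5,
                              concurrency=2, n_orders=100_000)
p_wait = probability_of_waiting(dist, concurrency=2)
print({k: round(p, 3) for k, p in dist.items()})
print(f"estimated probability that an arriving order has to queue: {p_wait:.1%}")
```

Swapping the exponential gaps for samples drawn from real order data would bring the estimate closer to the distribution discussed in the dialogue.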
“So let’s put everything together and look at how long your customers typically wait for their orders to be fulfilled at the current utilization. Remember, this was the relationship you found surprising earlier!”
The Customer Experience
“This is captured by the Customer Lead Time probability distribution, one of the outcome parameters for your process.”
“The distribution shows that the impact of queueing on customer lead times is not too significant (at least, not yet).
The most probable lead time that a customer will experience is 0.51 days, and the average lead time is 0.6 days, which aligns with your lead time flow metrics. But as you can see, longer lead times are also possible for some customers, and overall, this lead time distribution is much more variable than the nearly deterministic service time distribution you have for the internal process.
The difference between the two distributions is entirely attributable to queueing. It is the most significant factor determining the end customer experience. And to a large extent, you cannot control the impact directly when you don’t control the demand.
In particular, about 7% of your customers will wait more than a day for their orders to be fulfilled and will, therefore, get their money back.
Think of this 7% loss rate as a queueing tax. At your current capacity, it is unavoidable, but it is also a way of keeping you honest in a competitive market.
At least your customers will not be as unhappy with the small slips in your SLAs, and you now have a very tangible economic reason to optimize your end-to-end customer experience instead of just focusing on optimizing your internal process.
The primary lesson here is that queueing will always amplify your service times, and your customers will always experience these amplified service times.
To see how critical this is, let’s look at what these amplified service times will look like if your order volume goes up by 20%. Since your service rate stays where it is today, this amounts to raising the utilization of your process from about 60% to about 72%.”
“As you can see here, when demand goes up by that much, the probability of customer lead time exceeding a day rises dramatically: from 7% to over 20%!
Now we are talking real money that will quickly make your business unviable if you don’t get it under control early!”
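The same comparison can be explored with a small simulation. This is a sketch under assumptions rather than a reproduction of the Advisor's figures: it assumes exponentially distributed gaps between arrivals instead of the measured arrival distribution, a constant 0.5-day service time, two concurrent slots, and first-come-first-served processing. The names `simulate_lead_times` and `share_over` are hypothetical.

```python
import heapq
import random

def simulate_lead_times(mean_interarrival, service_time, concurrency,
                        n_orders, seed=0):
    """Customer lead time (waiting plus service, in days) for each order."""
    rng = random.Random(seed)
    free_at = [0.0] * concurrency  # min-heap: when each slot next becomes free
    now = 0.0
    lead_times = []
    for _ in range(n_orders):
        # Assumption: exponentially distributed gaps between arrivals.
        now += rng.expovariate(1.0 / mean_interarrival)
        # Work starts when the order has arrived and a slot is free.
        start = max(now, heapq.heappop(free_at))
        finish = start + service_time
        heapq.heappush(free_at, finish)
        lead_times.append(finish - now)
    return lead_times

def share_over(lead_times, limit):
    """Fraction of orders whose lead time exceeds the limit."""
    return sum(1 for t in lead_times if t > limit) / len(lead_times)

# Current demand, and the same demand with 20% more orders per day.
today = simulate_lead_times(0.41, 0.5, 2, n_orders=100_000)
busier = simulate_lead_times(0.41 / 1.2, 0.5, 2, n_orders=100_000)

for name, lead_times in [("today", today), ("20% more orders", busier)]:
    average = sum(lead_times) / len(lead_times)
    print(f"{name}: average lead time {average:.2f} days, "
          f"share over one day {share_over(lead_times, 1.0):.1%}")
```

Each order whose lead time exceeds one day corresponds to a refund under the one-day guarantee, so the second number printed for each scenario is the simulated queueing tax.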
“That’s sobering news!” says the Entrepreneur.
“How do we get ahead of this?”
“Let’s examine some options,” says the Advisor.
In the next episode, the Advisor examines several options and the tradeoffs they involve so that the Entrepreneur can stay ahead of the looming disaster that might come with more growth in her business as the money-back guarantee backfires in a big way!