The year is 2034, and the AI revolution continues unabated.
Silicon Valley has relocated to cryopods in near-earth orbit as the utopians await the Singularity.
It is now expected in 2040.
In the meantime, AI has helped solve many mundane problems of the early 21st century.
Rapid software product development is considered a solved problem. Small teams of experienced domain experts and skilled software engineers manage AI coding farms, delivering software at previously unimaginable speeds.
One-day delivery of new features and change requests to complex custom applications is routine.
Most businesses outsource application development to specialized companies with the technical expertise and capital needed to build and deliver software with AI-optimized infrastructure.
The Entrepreneur
Our protagonist is the CEO of one of these innovative custom software factories[1], which are the backbone of the software development industry today.
Her startup has developed an AI-optimized development process, and her team of engineers and domain experts can take a custom order from idea to production in half a business day.
The company pitch is “Your idea - live in production in one business day, or we’ll give you your money back.”
The team is still tiny: two domain experts and two engineers who manage their AI coding farm, which does much of the heavy lifting.
Business has been growing fast.
But so have losses on the one-day guarantee…
The Weekly Business Review
The Entrepreneur is unhappy.
“The losses on the one-day guarantee are starting to hurt us,” she tells the Lead Engineer.
“We designed our process to deliver four daily orders with two engineers. Our throughput is below target, and our work in progress is below what it needs to be to meet this target!”
“Our engineers need to be more productive. We need to increase our velocity!”
The Lead Engineer looks at the same dashboard and says, “Look, our engineering cycle time and customer lead times are exactly where we designed our systems to be.”
“We have a customer lead time SLA of 1 business day. Our current development cycle time is 0.54 days, and our customer lead time is 0.64 days on average. That’s well within our SLA.”
“You do realize this is not a factory making the same widgets repeatedly, right?”
“Every order and change request is different. Our engineering skills in mapping customer orders to the right prompts for the AI, correcting for hallucinations[2], and keeping all our delivery pipelines green through all this are what make this operation work!”
“Our team already feels like they can barely keep up with the work coming in!”
“We designed our process around maintaining predictable development cycle times and customer lead times, and we are meeting these commitments. Why are we focusing on velocity and productivity instead of customer response time?”
The Entrepreneur is puzzled but agrees the Lead Engineer has a valid point.
“But why do we often fail to meet our one-day customer guarantee if we are within the SLA?” she asks.
The Lead Engineer does not have an answer.
“Maybe something is wrong with the data?” he asks.
They are at an impasse, so they engage an Advisor to help resolve the question.
The Queueing Theorist
“I need you to help us understand why we are losing money on the one-day guarantee when all our operational flow metrics seem to say we are doing fine,” the Entrepreneur says to the Advisor.
The Advisor spends time with the team to understand how the operation works, examines how they measure their processes, and requests some additional information.
“There is an important metric you are not currently tracking, which should help us get to the bottom of what is happening here.”
“I’ll need to understand how and when your customer orders are coming into your process,” she says.
She reviews some data from the company order management system and builds a model of the customer experience to help the team understand what is going on.
It does not take long to resolve the first puzzle that concerns the Entrepreneur the most: is velocity/productivity good enough?
“The first thing to note is that none of your metrics are looking at the inputs to your process,” the Advisor says to the Entrepreneur.
“Your flow metrics will vary depending on how heavily your process is being loaded concurrently by customer demand, so the first thing to check is to see what that demand looks like.”
“A quick analysis shows that you receive an average of 2.4 orders daily, which, not surprisingly, matches your throughput of 2.4 orders daily.”
“So, we can confidently say that your velocity/productivity is not your problem. You can’t ship orders any faster than they are coming in!”
“But let’s look at the underlying principle that lets me make this claim.”
“Your service rate, the rate at which you can service orders with two engineers when each order takes half a day, is four orders per day, and the variability in your development cycle time is very low. Your arrival rate, at 2.4 daily orders, is well below this rate. We can confirm this from your current cycle time data.”
“The critical metric I mentioned, the one you should pay close attention to, is utilization: the ratio of these two numbers.”
“This is 0.6 (2.4 / 4), and as long as this number is less than 1, your throughput will always match your arrival rate, because you have enough capacity to keep up with the demand.”
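The Advisor's arithmetic can be sketched in a few lines. This is a minimal illustration using the figures quoted in the story (two engineers, half-day cycle time, 2.4 orders per day); the variable names are our own.

```python
# Utilization check for the two-engineer shop, using the numbers
# quoted by the Advisor. All figures are from the story above.
ENGINEERS = 2        # concurrent "servers" (engineers)
CYCLE_TIME = 0.5     # days of work per order
ARRIVAL_RATE = 2.4   # customer orders arriving per day

# Capacity: how many orders the team can complete per day.
service_rate = ENGINEERS / CYCLE_TIME          # 4.0 orders/day

# Utilization: demand as a fraction of capacity.
utilization = ARRIVAL_RATE / service_rate      # 0.6

print(f"service rate: {service_rate} orders/day")
print(f"utilization:  {utilization:.2f}")
```

As long as utilization stays below 1, throughput simply tracks the arrival rate, which is why "shipping faster" cannot raise it.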
“However, talking to your engineers, they already feel overwhelmed with the customer demand today, and you are starting to see some customers experience larger lead times.”
“This is not surprising at these utilization levels, and, as we’ll see, this is also the underlying reason for your losses on the one-day guarantee.”
The Entrepreneur says, “I don’t understand. What does utilization have to do with customer lead time?”
“If we want to be efficient, shouldn’t we maximize our utilization and ensure we are getting the most productivity from our engineers? They are our biggest expense after AI costs. They are also very hard to find these days.”
“No, unfortunately,” says the Advisor.
“Your business model relies on high throughput and engineering productivity, which is higher when running at higher utilizations. However, you can only go so far with that strategy since you also have money riding on short customer lead times because of your money-back guarantee.”
“Customer lead times increase as you increase utilization. So throughput/productivity and customer lead time are always in tension.”
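The tension the Advisor describes can be made concrete with a standard queueing model. The sketch below uses the Erlang C formula for an M/M/c queue (Poisson arrivals, exponential service times), which is an assumption on our part; the story says the team's cycle-time variability is low, so these numbers overstate the real waits somewhat, but the shape of the curve, lead times exploding as utilization approaches 1, is the point.

```python
import math

def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arriving order must wait (M/M/c queue)."""
    rho = offered_load / servers
    top = offered_load**servers / math.factorial(servers)
    below = sum(offered_load**k / math.factorial(k) for k in range(servers))
    return top / ((1 - rho) * below + top)

def mean_lead_time(arrival_rate: float, servers: int, cycle_time: float) -> float:
    """Mean lead time = mean queueing wait + cycle time (M/M/c)."""
    mu = 1 / cycle_time                  # service rate per engineer
    a = arrival_rate / mu                # offered load
    wq = erlang_c(servers, a) / (servers * mu - arrival_rate)
    return wq + cycle_time

# Lead time vs. demand for the two-engineer shop (capacity 4 orders/day).
for lam in (2.0, 2.4, 3.0, 3.6, 3.9):
    w = mean_lead_time(lam, servers=2, cycle_time=0.5)
    print(f"arrivals/day {lam:.1f} -> utilization {lam / 4:.2f}, "
          f"mean lead time {w:.2f} days")
```

At 60% utilization the model predicts lead times under a day on average; push utilization toward 1 and the same half-day cycle time yields lead times of many days. That is the throughput-versus-responsiveness tension in a nutshell.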
“Please explain how this can be. I’m really curious now! It seems I am missing some fundamental piece of the puzzle here,” says the Entrepreneur.
“Yes, this is a common confusion. To understand why, we need to understand the impact of queueing in your system,” says the Advisor.
Queueing
“You’ve designed a highly optimized delivery process, and looking at the data, each custom order does indeed take around 0.5 days to go from start to finish. There is minimal variability here, and it’s a tribute to the technology and process capabilities you’ve built around it!”
“However, your business model has not accounted for the variability in how and when customer orders reach this process, and here we run into the fundamental constraint of concurrency.”
“If a customer order arrives when your team is busy processing orders, then that order has to wait until the team can work on it. This wait time becomes significant if many orders arrive within a short window. This is what we call queueing.”
“In your case, the impact of queueing is reflected directly in your customer lead time. The random wait times due to queueing, added to your stable development cycle times, introduce significant variability into your customer lead times. This is the crucial factor you have overlooked in your business model and the flow metrics you are currently tracking.”
“Some fraction of your customers will experience lead times of well over a day, even though each order was processed in a half day or less internally. These are the orders that are costing you money.”
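The Advisor's claim, that some orders blow past the one-day guarantee even though every order takes only half a day of work, can be checked with a small simulation. This is a sketch under assumed conditions: Poisson arrivals at 2.4 orders/day and a fixed half-day cycle time (the story says cycle-time variability is minimal), with orders served first-come, first-served by whichever engineer frees up first.

```python
import heapq
import random

def simulate_sla_misses(days: float = 10_000, arrival_rate: float = 2.4,
                        cycle_time: float = 0.5, engineers: int = 2,
                        sla: float = 1.0, seed: int = 42) -> float:
    """Fraction of orders whose lead time (wait + cycle time) exceeds the SLA.

    Poisson arrivals, deterministic service, FCFS across `engineers` servers.
    """
    rng = random.Random(seed)
    free_at = [0.0] * engineers        # time each engineer next becomes free
    heapq.heapify(free_at)
    t, misses, total = 0.0, 0, 0
    while t < days:
        t += rng.expovariate(arrival_rate)      # next order arrives
        start = max(t, heapq.heappop(free_at))  # wait if both engineers busy
        finish = start + cycle_time
        heapq.heappush(free_at, finish)
        total += 1
        if finish - t > sla:                    # lead time exceeds guarantee
            misses += 1
    return misses / total

frac = simulate_sla_misses()
print(f"~{frac:.1%} of orders miss the one-day guarantee")
```

Even at a comfortable-looking 60% utilization, a visible fraction of orders arrive during a busy burst, queue for more than half a day, and miss the guarantee, which is exactly where the money-back losses come from.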
“How do we fix this?” the Entrepreneur asks.
“You can’t fix this,” says the Advisor.
“Queueing is a phenomenon that emerges whenever you have concurrency constraints in your process. It’s often not easy to tell where and when this might emerge. It introduces an element of randomness into even the most deterministic process, especially when you don’t have control over demand.”
“We can never eliminate the probability of queueing emerging in a process with concurrency constraints. However, recognizing that you are dealing with a random process instead of a deterministic one gives you powerful tools to manage the impact of queueing much more strategically.”
“The important thing is determining the probability that queueing behavior will cause meaningful impacts on your business. In this case, it is clearly happening, so the questions are how bad the impact is and what your risk is of it getting worse as your business grows.”
“How do we do this?” asks the Entrepreneur.
“We’ll start by constructing a causal model for your process, something that will let you connect the output metrics you are currently tracking, like throughput, cycle time, and customer lead times, to the controllable inputs that cause the metrics to be the way they are.”
“We’ve seen many elements of this causal model already in the high-level analysis of productivity and throughput I gave you earlier. But there is so much more we can do once we build one, so let’s understand what one looks like for your business.”
“The causal model for your process is relatively simple. I’ll show you how we can use it not only to understand your current risks but also to understand what risks you face as your business grows and how to manage those risks strategically.”
“I can’t wait to dive into this!” says the Entrepreneur.
In Part II, The Advisor will show the Entrepreneur how she can use a queueing model to manage key process-related risks strategically for her business.
Stay tuned.
[1] A largely uncontroversial concept in 2034.
[2] Still a problem in 2034.