In Software Engineering, How Good is Good Enough?

Categories: AI and ML, Technology

A recurring challenge in software — especially for newly developed systems or major upgrades — is knowing when you’re ready to deploy or ship for the first time in production. When is “good enough” actually “good enough?”

In this article, we’ll explore this concept of good enough and the nuance around it. You’ll learn key differences between internal and external success, how ship decisions are made by various stakeholders, and where teams and individuals often disagree on precisely what makes a product or update “good enough” and ready to go. Finally, we’ll work through different approaches using real-world use cases and see what lessons can be learned from those.

Part 1: Internal vs. External success

Perceptions of risk depend heavily on company culture, with fear of failure (individually or corporately) being a major driving factor. Most companies agree that a reputation for secure, high-quality products is important to achieve market and business success. 

What differs is how success is measured: internally or externally. 

Companies with an internal focus generally use key performance indicators (KPIs) and other internally-defined metrics to provide incentives or disincentives to drive desired employee and departmental behaviors. These metrics are proxies for desired outcomes; for example, “minimize the number of security penetration vulnerabilities detected post-ship.” 

Causes and risks of an internal focus

Many large companies, in particular, see proxy metrics as the only way to effectively communicate goals throughout a complex organization, to achieve alignment, and to measure success business function-by-function or department-by-department.

Unfortunately, this focus on modeling desired outcomes using proxy metrics often results in the individuals within the company seeking to maximize (or minimize) the metrics that apply to them, rather than focusing on the overall success of a given product in the marketplace. 

No one in this system has the wrong intent. Who could argue that it’s good to ship with more security vulnerabilities? By seeking to eliminate security vulnerabilities, the security group (for example) is merely seeking to do its job. With an internal focus, they will see their job as ensuring compliance with the policies the team has established to achieve the goals set by upper management.

However, a department-by-department optimization approach can lead to a bad outcome for the company. It is impossible — or at least economically infeasible in a finite time — to create a product of any kind, software or hardware, that is literally perfect in every dimension. And even in those areas where the product is “perfect,” time:

  • exposes new patterns of usage
  • introduces new vulnerabilities in underlying infrastructure or integrated systems
  • results in changes in the operating environment
  • spreads new usability paradigms
  • and introduces other factors that will make even an initially ‘perfect’ product imperfect. 

Successful products are a balance of features, time to market, and non-functional requirements like usability, security, quality, and many others. Which aspect of a product should be emphasized at a given time to achieve a given business outcome requires balance across multiple factors that are sometimes — in fact, often — in conflict. 

In an internally-focused company, ship decisions are often made by executives based on the input from the various business functions and departments. Each of those stakeholders is trying to follow corporate direction by maximizing (or minimizing) the appropriate metrics. 

This can result in some groups having an incentive to avoid shipping a product at all, in order to minimize their department’s or their own exposure to a potentially negative impact on their group’s KPIs. And sometimes, not shipping is the right thing to do. 

On the other hand, in an internally-driven culture, a product that might otherwise be highly successful in the market might never be seen as “good enough” by all its internal stakeholders because shipping it poses a risk to a particular group’s KPIs or other departmental incentives. No one is doing anything ‘wrong’ or ‘dishonest’ in this scenario; in fact, the groups are responding to the direction they have been given by upper management in the form of these metrics.

However, the net effect is that such companies tend to be highly conservative. They tend not to ship innovative products because they pose a risk of failure to meet internal metrics for one group or another.

How externally-focused organizations differ

Externally-focused companies, on the other hand, are market-focused and look for their signs of success or failure from their buyers, customers and end users. They, too, are sensitive to the risk of failure, but it’s commercial failure in the marketplace that keeps them up at night. 

No single group can afford to place its individual departmental success or mission above the goal of shipping a successful product. This is not out of altruism, or because the individuals involved have different skill sets, quality standards or personalities than their counterparts in an internally-focused company. Rather, it’s because of the way success is measured. 


In an externally-focused company, the whole team, meaning every department, succeeds or fails based solely on the success of the product in the market. No department is viewed as successful, even if it hits its individual metrics, if the company or the product fails. Market success is a shared goal, and all teams in an externally-focused company have no choice but to work together to achieve it.

That doesn’t mean that teams within a market-focused company always agree with each other. Far from it. 

Product Management and Engineering may disagree on dates and feature sets; the quality and security groups may raise red flags that are heeded or ignored; DevOps and Ops may fight with the business over tool choices and FinOps. What is different in an externally-focused company, though, is that all of these disagreements are in pursuit of a common goal rather than a departmental goal. 

That shared goal is to quickly ship a featureful-enough, secure-enough, high-enough-quality product that succeeds in the marketplace. In this context, maximizing the metrics of an individual group is irrelevant if the product fails, because everyone loses.

When all stakeholders are focused on a common goal, product success, disagreements tend to be healthy and rapidly resolved.

Small companies, and startups in particular, tend to be externally focused, that is, market-focused. The major incentive and KPI at a startup is stock options, and these only become valuable if the product and company succeed in the marketplace.

Differences in focus between small and large companies 

Small companies tend to have only a small number of products — sometimes just one. If the product and company do NOT succeed in the marketplace, then a small company will likely close down, making the stock options worthless. Despite the market risk, these factors and others align to make startups externally focused (which is part of their attraction to many good engineers and investors).

Large companies may be or become externally focused, as well. Apple is a very good example. Before Steve Jobs rejoined as interim CEO in the late 1990s, Apple had become very much an internally-focused company. Jobs’ and Apple’s tremendous achievement was due in large part to Steve’s success at flipping Apple’s focus from internal to external. 

He did that by getting a critical mass of people focused on their product’s success with consumers, and in the marketplace. Jobs did this while maintaining what most of us would agree were high quality and security standards. “Externally focused” does not mean “sloppy.” 

Externally focused does mean “risk-based,” though. As a market-focused company, it does not make sense to put effort into areas that fail to generate business value. 

This means that instead of absolute, unalterable, KPI- or departmentally-driven standards for each individual aspect of a product, an externally-focused company instead looks at how best to maximize the business value of the entire product for each release. That is the essence of creating a product that is “good-enough,” “secure-enough,” “featureful-enough,” and so on. This does not mean ‘sloppy,’ but it does mean a ruthless focus on creating release-by-release business value.

Now, how do you quantify “good enough” when making risk-based ship decisions?

Part 2: Approaches to “Good Enough”

With this understanding of how market-focused companies make risk-based ship decisions based on maximizing the business value of each release, we can begin to explore different ways to quantify “how good is good enough.”

Achieving external market-based success often requires a balancing act. In general, you want to get a new product in front of real, live customers as soon as possible. This not only delivers value to your users, but also allows you to gather their feedback so you can improve the product (“pivot or persevere” in Lean/Agile terminology). Success here also lets you unlock potential revenue streams from investors, internally from your own company, and from customers themselves. 

On the other hand, if your product is not “good enough” when customers first see it, you may lose them irreparably and fail in a different way to unlock your revenue stream. In other words, there are risks in shipping, and also risks in not shipping. 

How do externally-focused companies make the tradeoff?

I started studying this issue back in the 1990s. At that time, I read a book that described Hewlett-Packard’s quality criteria for at least some of its products (this was pre-split, before HP became “HP” and “HPE”).

The book said that HP’s quality criterion for a given product was met when finding the next critical bug internally through testing became more expensive than letting customers find that bug post-ship, with reputation and remediation costs included in the customer-side figure.

This was a very high bar in the ‘90s. In those pre-cloud days, systems were deployed on premises, on customer equipment. For at least some systems, replacing them in the field required flying technicians to multiple points of the globe to physically install from media, apply configuration options, transition data, and perform other operations. Replacing a critical system in the field was expensive, so this quality bar was very high. Today, of course, we can deploy in production automatically multiple times a day, so it might not seem like a big deal. But it was then.

Future events are always probabilistic; we don’t know for sure how much testing will be required to find the next critical bug, or how much customer use will uncover it. We can use tools like the defect discovery rate and projections based on past releases to put numbers on these events. But in the end, we are balancing one uncertainty or “risk” against another. This 1990s-era HP criterion was therefore “risk-based”; that is, they shipped when the probabilities indicated that continuing to test had likely become more expensive than finding and remediating a serious quality issue after shipping.
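To make the criterion concrete, here is a minimal Python sketch of the comparison. It is an illustration under stated assumptions, not HP’s actual method: the function names, the dollar figures, and the constant-discovery-rate model (mean wait equals one over the rate) are all hypothetical.

```python
def expected_weeks_to_next_critical_bug(critical_bugs_found_per_week: float) -> float:
    """Assumes a constant discovery rate, so the mean wait is 1 / rate."""
    if critical_bugs_found_per_week <= 0:
        raise ValueError("discovery rate must be positive")
    return 1.0 / critical_bugs_found_per_week


def expected_cost_of_more_testing(weekly_test_cost: float,
                                  critical_bugs_found_per_week: float) -> float:
    """Expected spend on testing before the next critical bug turns up."""
    weeks = expected_weeks_to_next_critical_bug(critical_bugs_found_per_week)
    return weekly_test_cost * weeks


def expected_cost_of_field_bug(probability_customer_finds_it: float,
                               remediation_cost: float,
                               reputation_cost: float) -> float:
    """Probability-weighted cost of a customer finding the bug after ship."""
    return probability_customer_finds_it * (remediation_cost + reputation_cost)


def should_ship(testing_cost: float, field_cost: float) -> bool:
    """Ship once further testing is expected to cost more than a field failure."""
    return testing_cost > field_cost


# Hypothetical inputs: one critical bug found every 8 weeks, $50k per test week.
testing_cost = expected_cost_of_more_testing(50_000, 0.125)

# Hypothetical cloud-style release: cheap to patch in the field.
cloud_field_cost = expected_cost_of_field_bug(0.25, 600_000, 400_000)

# Hypothetical 1990s on-premises release: technicians fly out to reinstall.
on_prem_field_cost = expected_cost_of_field_bug(0.25, 3_000_000, 1_000_000)

print(should_ship(testing_cost, cloud_field_cost))    # True
print(should_ship(testing_cost, on_prem_field_cost))  # False
```

With identical testing costs, the same product passes the bar when field remediation is cheap and fails it when remediation is expensive, which is why the criterion was so demanding for on-premises systems.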

I followed HP’s example with a number of products in that timeframe, and for many of them I was accused of being too conservative as the customer never did find that next critical bug. Still, we were playing the odds and very conscious of it.

HP’s approach appealed to me because it made sense, business-wise and quality-wise. You ship when it’s more costly not to ship. 

The potential cost of shipping could be defined so as to include all the factors that keep people up at night: security incidents, production-down incidents, reputational damage — even loss of life. These would be weighted by their probability of occurrence. The cost of not shipping, on the other hand, would include:

  • loss of revenue, 
  • loss of opportunity and competitive advantage, 
  • the cost of continued development and testing, 
  • development infrastructure costs, 
  • and everything else that goes into keeping a product under on-going development without a supporting revenue stream.
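The two sides of this ledger can be sketched in a few lines of Python. Everything here is hypothetical: the `Risk` type, the probabilities, the dollar amounts, and the assumption that three more months of work would halve each risk’s probability. The point is only the shape of the comparison, in which the cost of delay is added to whatever risk remains after the delay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # chance the event occurs over the period considered
    impact: float       # cost in dollars if it does occur


def expected_cost_of_shipping(risks):
    """Sum of each risk's impact weighted by its probability."""
    return sum(r.probability * r.impact for r in risks)


def cost_of_not_shipping(monthly_costs, months_delayed):
    """Total cost of keeping the product in development for the delay."""
    return months_delayed * sum(monthly_costs.values())


# Made-up risks of shipping today.
risks_now = [
    Risk("security incident", 0.05, 2_000_000),
    Risk("production-down incident", 0.10, 500_000),
    Risk("reputational damage", 0.02, 5_000_000),
]

# Assumption: a three-month delay halves every probability.
risks_after_delay = [Risk(r.name, r.probability / 2, r.impact) for r in risks_now]

# Made-up monthly costs of staying in development.
monthly_costs = {
    "lost revenue": 80_000,
    "lost opportunity and competitive advantage": 20_000,
    "continued development and testing": 120_000,
    "development infrastructure": 15_000,
}

ship_now = expected_cost_of_shipping(risks_now)
ship_later = (cost_of_not_shipping(monthly_costs, 3)
              + expected_cost_of_shipping(risks_after_delay))

print(f"ship now:   ${ship_now:,.0f}")
print(f"ship later: ${ship_later:,.0f}")
print("decision:", "ship" if ship_now <= ship_later else "keep testing")
```

With these invented inputs, shipping now carries an expected cost of about $250,000, against about $830,000 for the delayed release, so the sketch says to ship. Different inputs can just as easily point the other way.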

Steve Jobs is quoted as saying to the original Apple Macintosh team, “Real artists ship.” What he meant by this, I believe, is that software systems (or anything else) must be put in the hands of end users to be valuable to anyone. A system delivers no business value while it is under development.

A system that never ships, because its developers or other stakeholders believe it’s imperfect, delivers no value at all. And, of course, no system is ever perfect. Even if it did start out being perfect, a real-life system is unlikely to remain perfect given emerging security threats, changes to the underlying software components a given system depends on, and many other factors outside the boundaries of the system itself.


Years ago, I was lucky enough to work with some PhD-level experts on economically-based business decision making. One then-widely accepted approach to such decision making was probabilistic and risk-based. I was surprised and a little shocked to learn that this often included putting a dollar value on lives lost, in order to calculate potential risk. 

For example, if an airline company wanted to determine whether a potentially life-saving improvement to one of their jets was worth the investment, they would go through a thought process something like this simplified illustrative example using fictitious numbers:

  • Replacement cost of the plane is $100M.
  • The plane carries 200 passengers.
  • Our expected liability for each passenger death is $10M USD (this was in the early 2000s, so today this figure would be closer to $20M USD per passenger).
  • The probability of a fatal crash over the 150,000 flight-hours expected service life of the airplane is 1.78%.

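Using these made-up figures, the potential amount of loss is $2.1B: the $100M aircraft plus 200 passengers at $10M each. The “expected loss” (probability of loss times the potential amount of loss) is therefore 1.78% of $2.1B, or about $37M USD over the life of the aircraft.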

In other words, if you were going to set aside a reserve to cover future losses from the crash liability for “N” number of airplanes of the same type, you should set aside “N x $37M USD.” If the lifetime probability of loss could be reduced to zero through an investment of $37M per plane or less, then it would be economically justified to spend it on improving the safety of the aircraft. 

If it costs more to reduce the risk to zero, or in general if the amount at risk would be reduced by less than the amount invested, the additional investment to improve safety would not be justified in purely economic terms.
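The arithmetic above can be checked with a short Python sketch. The inputs are the article’s fictitious figures; the function names and the two $30M upgrade scenarios are hypothetical additions for illustration.

```python
def expected_loss(probability: float, amount_at_risk: float) -> float:
    """Probability of loss times the amount that would be lost."""
    return probability * amount_at_risk


def investment_is_justified(investment: float,
                            loss_before: float,
                            loss_after: float) -> bool:
    """Justified in purely economic terms when the reduction in
    expected loss is at least as large as the investment."""
    return (loss_before - loss_after) >= investment


# Fictitious figures from the example above.
PLANE_REPLACEMENT_COST = 100_000_000
PASSENGERS = 200
LIABILITY_PER_PASSENGER = 10_000_000
LIFETIME_CRASH_PROBABILITY = 0.0178

amount_at_risk = PLANE_REPLACEMENT_COST + PASSENGERS * LIABILITY_PER_PASSENGER
baseline = expected_loss(LIFETIME_CRASH_PROBABILITY, amount_at_risk)

print(f"amount at risk: ${amount_at_risk:,}")  # $2,100,000,000
print(f"expected loss:  ${baseline:,.0f}")     # about $37.4M

# Hypothetical upgrade A: $30M per plane removes the risk entirely.
print(investment_is_justified(30_000_000, baseline, 0.0))     # True

# Hypothetical upgrade B: $30M per plane only halves the crash probability.
halved = expected_loss(LIFETIME_CRASH_PROBABILITY / 2, amount_at_risk)
print(investment_is_justified(30_000_000, baseline, halved))  # False
```

Upgrade B illustrates the second case in the text: it reduces the amount at risk by roughly $18.7M, which is less than the $30M invested.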

This is a very simplified analysis and perhaps morally repugnant because human life is involved. How can one put a financial value — even double the original value, or $20M in today’s dollars — on an individual human life? 

Yet the alternative is to never ship. If we had to reduce the risk of a fatal airplane crash to zero, no one would ever be able to fly. 

The cost of creating such a plane would so far exceed the potential losses from the current generation of planes that no company, group of companies, or probably even nation would ever be able to make the investment. And even if it were possible to build a perfect plane, external factors like deliberate malice, meteor strikes or other non-engineering factors would mean that some people would still die. The cost of the added investment in safety would also have to be passed along to the consumer, raising the price of travel for the end user, perhaps by more than the decreased risk would be worth even to the travelers themselves.

It’s heartless, but if one wants the benefit, one needs to take the risk. The art is making the risk of loss smaller than the expected reward, to the company and to the customers, who follow their own risk/reward calculus.

We each make risk-based decisions every day, yet we hardly think about them. We risk our lives, to a greater or lesser extent, every time we commute to work. For those who drive in the U.S., the cumulative lifetime risk of dying in a car accident is about 1 in 93. Yet we continue to drive and take other risks, because we believe the benefit outweighs the risk. We also put the lives of our loved ones at risk each time they accompany us in a car, ride a bike, or even go for a walk. We rarely worry about it or think about it, but at some level our brains do the math and decide to take a risk to get a reward. 

Even following a risk/reward model, we can still set the quality bar as high as we like by making the expected cost parameters as high as we choose. Instead of $10M liability per life, we can increase it to $100M or $1B per life, for example. We can continue to invest until an arbitrarily high degree of perfection is achieved — or until we run out of money. However, in the meantime, both the customer and the company are denied the benefits they would get through real-life use of the product. This is also a very real cost. 
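As a rough illustration of how the liability assumption moves the quality bar, the Python sketch below reuses the fictitious airplane figures (200 passengers, a $100M plane, a 1.78% lifetime crash probability) and varies only the liability per life. The function name is hypothetical, and the result is the most that could be spent per plane to eliminate the risk and still be justified in purely economic terms.

```python
def max_justified_safety_spend(liability_per_passenger: float,
                               passengers: int = 200,
                               plane_cost: float = 100_000_000,
                               crash_probability: float = 0.0178) -> float:
    """Expected loss per plane, which caps the economically justified
    spend on eliminating the risk."""
    amount_at_risk = plane_cost + passengers * liability_per_passenger
    return crash_probability * amount_at_risk


for liability in (10_000_000, 100_000_000, 1_000_000_000):
    spend = max_justified_safety_spend(liability)
    print(f"${liability:>13,} per life -> up to ${spend:>13,.0f} per plane")
```

Moving from $10M to $1B per life raises the justified spend from about $37M to about $3.6B per plane. The model itself is unchanged; only the bar, and the time and money needed to clear it, goes up.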

We can safely assume you believe your product will deliver value to its users: If you didn’t think so, why would you build it in the first place? If that is your belief, then withholding your product from the market hurts its potential users by denying them the benefits that using your product would bring. While you obviously don’t want to ship them something bad or that doesn’t work, bringing something to market that delivers positive benefits to its users is valuable to them, and therefore worthwhile. 

Considering only the potential risks associated with shipping a product, without weighing the potential benefits internally and externally, is not a sound or a balanced ‘engineering’ approach to ship decisions. Weighing both costs and benefits when making a risk-based ship decision is not a compromise. It’s the essence of a value-based business and engineering approach.
