TestWorks Quality Index(TM)
Application Note

© Copyright 1999-2006 by Software Research, Inc.

The TestWorks Quality Index:
A Quantitative Quality Index for Your Application Development Process

OVERVIEW

Assessing the relative quality of a software system is a complex but important matter in software engineering. To make reasoned decisions about complex software requires an approach that combines analysis of product properties with analysis of the underlying software construction process. A weighted figure of merit software quality index -- the TestWorks Quality Index described here -- offers an attractive approach to doing this because it takes into account software quality metrics, process assessments, and other practical considerations.


TestWorks Quality Index: Assess the quality of your quality process with the TestWorks Quality Index.
Table I -- Software Quality Process Filters: Assesses the relative advantages and disadvantages of various software quality methods.
Table II -- TestWorks Quality Index Definitions: Summarizes the definitions of the factors that make up the TestWorks Quality Index and shows how to compute it for YOUR process.
Table III -- Product Application Profile: Gives a recommended composition of use of TestWorks products and indicates likely CMM-like levels and relative overall process efficiencies.



INTRODUCTION

A common problem in software development is:

The Software Quality Index approach to this problem is to assess the quality of a particular application by weighing the answers to questions that address BOTH the properties of the application itself and the characteristics of the process used to produce it.

QUALITY PROCESS ASSESSMENT METHODS

The SEI CMM and the ISO-9000 type quality process models are based on examining the process that produces the product. This approach is based on the well-documented fact that a better industrial process tends to produce a better product, and that continual incremental improvements to that process tend to lead to continual incremental improvements in its product. This simple method can account for spectacular quality gains.

While this technique is clearly valid in general terms, sometimes good processes produce bad products and bad processes produce good products. This happens annoyingly often in software products, perhaps because some of the intermediate elements of the process can be very difficult to measure.

Quality process models accept such exceptions as anomalies and focus on the main point: improving the process improves the product.

PRODUCT METHODS

The Product Analysis approach, often called the metrics approach or the static analysis approach, takes the opposite tack: look at the final product only, and base decisions about its quality on what is actually there, regardless of how it got there. After all, the final source code itself completely determines what an application can do. Regardless of how it was produced, regardless of the methodology or tools or process used to make it, the actual quality of a software product is determined directly by its own internal, intrinsic properties.

Even if it is junky, spaghetti code hacked together by rank amateurs, if it works well then it works well. Who needs a fancy software process, anyway? Simply put, quality is determined in the contest of the marketplace. Of course, given that quality is implicit in the as-built product, we still have to find a way to measure it if we want some measure of control over the result. To measure the quality of an application by its structural properties or content, we use software metrics (e.g., cyclomatic complexity, size metrics -- there are hundreds of possible metrics).

The paradox is that just having measurable high-quality code that meets small-scale and large-scale quality guidelines is no more a measure of field product success than having a perfect manufacturing process.

PROCESS/APPLICATION ASSESSMENTS

A combined process/application assessment brings together the strengths of each of these approaches. A software quality manager needs to take into account the following factors: HOW a product was built; WHAT its characteristics are; WHY better quality is important; and what the producers and their management -- the team -- FEEL about how good the team/product combination is.

A multi-faceted assessment method can be fooled too, of course, but its strength is that it focuses on the aspects of both process assessment and product assessment that are perceived as key to quality.

There are plenty of available technical alternatives. Some of them are shown in the accompanying

Table I -- Software Quality Process Filters

which shows a range of possible software quality filters and also indicates how they can be applied, what some of their limitations and advantages are, and where the payoffs -- if any -- lie with each method or approach.

THE METHODOLOGY

The TestWorks Quality Index is a balanced, weighted, experience-based estimate built from selected factors. It combines estimates, measurements, and process characterizations into a single quality figure that can be used to compare products.

The TestWorks Quality Index value is the average score obtained on a simple question list, where each specific quantitative response earns "points" assigned on the basis of current engineering experience. The more points scored, the better the product.
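
As a minimal sketch of the averaging described above, the following Python snippet computes an index value from per-question points. The function name, the factor labels used as dictionary keys, and the sample point values are assumptions made purely for illustration; the actual questions and point assignments are the ones defined in Table II.

```python
from statistics import mean


def quality_index(points_by_factor):
    """Return the average of the points scored across all answered questions."""
    if not points_by_factor:
        raise ValueError("at least one scored question is required")
    return mean(points_by_factor.values())


# Hypothetical points for three questions (illustrative values only,
# not taken from Table II).
example_points = {"F1": 8, "F2": 6, "F3": 10}
index_value = quality_index(example_points)
print(index_value)
```

Because the index is a plain average, a low score on one question can be offset only by higher scores on the others.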

In engineering this kind of calculation is usually called a "Figure of Merit" (FOM), and FOMs have a long tradition of use in comparing complex things. From assessing competitive proposals (which are scored according to weighted averages) to determining plant efficiency, engineers take this practical approach even when it is known that there is no theoretical solution.

Some benefits the TestWorks Quality Index offers in assessing your products are:

HOW THE TestWorks QUALITY INDEX WORKS

The TestWorks Quality Index works as shown on the following chart. The factors on the chart are metrics that you can measure, or are assessments you can make, in a straightforward way. Detailed explanations of the terms follow the chart.

As you read the explanations, think of a specific project that you're working on, and try to calculate its TestWorks Quality Index score as you go along.

TestWorks QUALITY INDEX CRITERIA

Not just any list of scored questions qualifies as a valid comparative index. To qualify as an effective indicator, some constraints have to be put on the TestWorks Quality Index (or its in-place equivalent) to make sure that it isn't manipulated to favor a particular process, product feature, or quality assurance approach.

Constraints that make sense are the following:

The idea here is to constrain the ways the FOM is computed so that you are forced to include the kinds of factors that ensure the FOM really is meaningful.

DETAILED EXPLANATION OF TERMS

Here are short explanations of the measures indicated above. The inclusion keys [D, S, T, P, $] were explained above. Note that no factor can be included in the matrix unless it addresses at least one of these inclusion keys.

Fn: F1
Short Definition: Cumulative C1 (Branch Coverage) Value for All Tests
Meets Criteria: D, S, Q
Explanation: This is the total C1 value achieved for this product on all tests, e.g. as measured by TCAT. Statement coverage (C0) is NOT acceptable because it understates results by half or more.

Fn: F2
Short Definition: Cumulative S1 (Callpair Coverage) Value for All Tests
Meets Criteria: D, S, Q
Explanation: This is the total S1 value achieved for this product on all tests, e.g. as measured by TCAT. Note that we are counting the connections between caller and callee, not just whether a function was ever called (which is called module testing). Module testing coverage (S0) is not acceptable.

Fn: F3
Short Definition: Percent of Functions with E(n) < 20
Meets Criteria: S, Q
Explanation: This measures the structural complexity of all functions, modules, or methods in the current application. Experience shows that a cyclomatic complexity E(n) = E - N + 2 (edges minus nodes plus 2 in the function's flow graph) greater than 20 implies a "too complex" function. This is not necessarily harmful in itself, but too high a percentage of such functions can be a serious warning sign of trouble ahead.

Fn: F4
Short Definition: Percent of Functions with Clean Static Analysis
Meets Criteria: S, Q
Explanation: Static analysis finds a broad class of defects that may cause trouble in the future. Many errors found by static analysis are non-critical, but too many static analysis detections are an indicator of poor quality. The measurement made here requires that a certain stated percentage of functions be subjected to some form of static analysis.

Fn: F5
Short Definition: Last PASS / FAIL Percentage
Meets Criteria: D, P, Q
Explanation: This is the total number of tests that PASS vs. the total number of tests available, as would be measured by the test controller, e.g. SMARTS. Tests PASS if they run as expected and produce output close enough to the baseline (as determined by the programmable differencer) to be acceptable.

Fn: F6
Short Definition: Total Number of Test Cases / KLOC
Meets Criteria: T, Q
Explanation: This is a measure of how thoroughly you have tested the software relative to its size, measured in thousands of lines of code (KLOC). Most software is very poorly tested, i.e. with very few test cases, so it may not take a great many tests to score a high value on this measure.

Fn: F7
Short Definition: Call Tree Aspect Ratio
Meets Criteria: S, Q
Explanation: This is a measure of the "verticalness" of the call tree of the package, with packages that have a less vertical structure (and thus are more independently testable) viewed as superior. The vertical height is the maximum depth of the calling tree (this is shown in Xcalltree/WINcalltree), and the horizontal width is the largest number of functions on any level in the tree. If the tree has multiple roots (as it likely will in most modern applications), then the average of the values for all possible roots is taken.

Fn: F8
Short Definition: Current Number of OPEN Defects / KLOC
Meets Criteria: T, J
Explanation: No software product/project is perfect; this metric indicates how many critical defects per KLOC are open. An open defect is typically one that has been reported and reproduced but remains unresolved, with no work-around available to the user.

Fn: F9
Short Definition: Path Coverage Performed for Some Percentage of Functions
Meets Criteria: P, J
Explanation: For almost all packages, some critical functions or modules require full path coverage, but not all do. This measures the percentage of all functions for which some form of path coverage has been performed. Note that path coverage is NOT the same as branch coverage.

Fn: F10
Short Definition: Cost Impact / Defect
Meets Criteria: $, J
Explanation: This is an indication of how critical a serious software defect might be, expressed in monetary terms, i.e. in terms of the direct cost of any defect. Note that the scale used tends to take points away for the most critical kinds of projects; this is done so that the more critical projects receive the greatest attention.
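
To make factor F3 concrete, here is a minimal Python sketch that applies the cyclomatic complexity formula E - N + 2 to each function and reports the percentage of functions that fall below the threshold of 20. The function names, the dictionary layout, and the sample edge and node counts are hypothetical; in practice these counts would come from a metrics tool, not from hand-entered values.

```python
def cyclomatic_complexity(edges, nodes):
    """Cyclomatic complexity of one function's flow graph: E - N + 2."""
    return edges - nodes + 2


def percent_below_threshold(flow_graphs, threshold=20):
    """Percentage of functions whose complexity is below the threshold.

    flow_graphs maps a function name to its (edge count, node count) pair.
    """
    if not flow_graphs:
        raise ValueError("no functions to measure")
    below = sum(
        1
        for edges, nodes in flow_graphs.values()
        if cyclomatic_complexity(edges, nodes) < threshold
    )
    return 100.0 * below / len(flow_graphs)


# Hypothetical (edges, nodes) counts for four functions.
sample = {
    "parse": (12, 10),     # complexity 4
    "dispatch": (40, 19),  # complexity 23, counted as "too complex"
    "render": (9, 8),      # complexity 3
    "validate": (30, 13),  # complexity 19
}
f3_percent = percent_below_threshold(sample)
print(f3_percent)
```

The threshold is a parameter so that a team using a different cut-off can reuse the same calculation.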

Note: Any product which is "life critical" gets 0 points. This artifact has the effect of forcing "life critical" situations to gain TestWorks Quality Index points by increasing the requirements on all of the other factors.

All of these factors are summarized in

Table II -- TestWorks Quality Index Definitions

which gives the definitions of the factors that make up the TestWorks Quality Index and shows how to compute it for your process.
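
The call tree aspect ratio (F7) is the least obvious of the factors to compute, so a small Python sketch may help. It rests on several assumptions that the text does not settle: the ratio is taken as height divided by width, height and width are averaged separately over all roots before dividing, and each function is placed on the level where it is first reached so that recursion does not loop forever. The call graph and all names below are hypothetical.

```python
def tree_shape(call_graph, root):
    """Return (height, width) of the call tree rooted at root.

    Each function is placed on the level where it is first reached, so
    recursion and shared callees do not cause repeated expansion.
    """
    seen = {root}
    level = [root]
    height = 0
    width = 0
    while level:
        height += 1
        width = max(width, len(level))
        next_level = []
        for caller in level:
            for callee in call_graph.get(caller, ()):
                if callee not in seen:
                    seen.add(callee)
                    next_level.append(callee)
        level = next_level
    return height, width


def aspect_ratio(call_graph, roots):
    """Average height and width over all roots, then return height / width."""
    shapes = [tree_shape(call_graph, root) for root in roots]
    avg_height = sum(h for h, _ in shapes) / len(shapes)
    avg_width = sum(w for _, w in shapes) / len(shapes)
    return avg_height / avg_width


# Hypothetical call graph: caller -> list of callees.
graph = {
    "main": ["init", "run", "report"],
    "run": ["step"],
    "worker": ["step"],
}
ratio = aspect_ratio(graph, ["main", "worker"])
print(ratio)
```

Under these assumptions a lower ratio corresponds to the flatter, less vertical structure that the factor treats as superior.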

ILLUSTRATIVE EXAMPLES

Here's how the TestWorks Quality Index works when applied to some example projects.

CONNECTING THE INDEX TO REALITY

The hard part comes when trying to connect with reality. The main question everyone asks is, "How reliable will my application be in the field?"

As students of software quality know very well, this is a very deep question to which there are few definitive or even suggestive answers. Instead, about the best we can do is associate a particular process's TestWorks Quality Index score with a likely estimate of reliability based on judgment and experience.

An initial experimental estimate of this is given in the attached

Table III -- Product Application Profile

which gives a recommended composition of use of TestWorks products and indicates likely CMM levels and relative overall process efficiencies.

Time will tell whether the numbers are too high or too low. Time will tell if the reliability values correspond to the SEI/CMM levels, or if the achieved reliability is too low or too high.

And time will tell whether the application of relatively simple quality filters will achieve the expected effect often enough to be relied upon.

But in any case, making the attempt to tie these ingredients together is essential.