The relationship among various attributes of resources and software products can be suggested by a case study or survey. For example, a survey of completed projects can reveal that software written in a particular language has fewer faults than software written in other languages. Understanding and verifying these relationships is essential to the success of any future project. Each of these relationships can be expressed as a hypothesis, and a formal experiment can be designed to test the degree to which the relationships hold.
Usually, the value of one particular attribute is observed by keeping other attributes constant or under control. Models are usually used to predict the outcome of an activity or to guide the use of a method or tool.
Models present a particularly difficult problem when designing an experiment or case study, because their predictions often affect the outcome: project managers frequently turn the predictions into targets for completion. This effect is common when cost and schedule models are used. Some models, such as reliability models, do not influence the outcome, since reliability measured as mean time to failure cannot be evaluated until the software is ready for use in the field.
There are many software measures intended to capture the value of an attribute. Hence, a study must be conducted to test whether a given measure reflects changes in the attribute it is supposed to capture. Validation is performed by correlating one measure with another: the second measure should itself be a direct and valid measure of the same factor.
Such measures are not always available or easy to obtain. Also, the measures used must conform to human notions of the factor being measured. Internal attributes are those that can be measured purely in terms of the process, product, or resource itself.
For example: size, complexity, dependency among modules. External attributes are those that can be measured only with respect to how the entity relates to its environment. For example: the total number of failures experienced by a user, or the length of time it takes to search a database and retrieve information.
Processes are collections of software-related activities. The different external attributes of a process are cost, controllability, effectiveness, quality and stability. Products are not only the items that the management is committed to deliver but also any artifact or document produced during the software life cycle.
The different internal product attributes are size, effort, cost, specification, length, functionality, modularity, reuse, redundancy, and syntactic correctness. Among these, size, effort, and cost are relatively easier to measure than the others. The different external product attributes are usability, integrity, efficiency, testability, reusability, portability, and interoperability.
These attributes describe not only the code but also the other documents that support the development effort. Resources are entities required by a process activity; a resource can be any input to software production.
It includes personnel, materials, tools and methods. The different internal attributes for the resources are age, price, size, speed, memory size, temperature, etc. The different external attributes are productivity, experience, quality, usability, reliability, comfort etc. A particular measurement will be useful only if it helps to understand the process or one of its resultant products. The improvement in the process or products can be performed only when the project has clearly defined goals for processes and products.
A clear understanding of goals can be used to generate suggested metrics for a given project in the context of a process maturity framework. To use the GQM paradigm, we first express the overall goals of the organization. Then, we derive from each goal the questions that must be answered to determine whether the goal is being met. Finally, we analyze each question in terms of what measurement is needed to answer it.
Typical goals are expressed in terms of productivity, quality, risk, customer satisfaction, etc. Goals and questions are to be constructed in terms of their audience.
Example: to characterize the product in order to learn it. Example: examine the defects from the viewpoint of the customer. Example: the customers of this software are those who have no knowledge of the tools.
According to the maturity level of the process given by the SEI, the type of measurement and the measurement program will differ. The following are the measurement programs that can be applied at each maturity level.
At Level 1, the inputs are ill-defined, while the outputs are expected. The transition from input to output is undefined and uncontrolled. For this level of process maturity, baseline measurements are needed to provide a starting point for measuring. At Level 2, the inputs and outputs of the process, the constraints, and the resources are identifiable.
A repeatable process can be described by the following diagram. The input measures can be the size and volatility of the requirements. The output may be measured in terms of system size, the resources in terms of staff effort, and the constraints in terms of cost and schedule. At this level, intermediate activities are defined, and their inputs and outputs are known and understood. A simple example of the defined process is described in the following figure. The input to and the output from the intermediate activities can be examined, measured, and assessed.
At this level, feedback from early project activities can be used to set priorities for current activities and for later project activities.
We can measure the effectiveness of the process activities. The measurement reflects the characteristics of the overall process and of the interaction among and across major activities. At this level, the measures from activities are used to improve the process by removing and adding process activities and changing the process structure dynamically in response to measurement feedback.
Thus, a process change can affect the organization and the project as well as the process itself. The process acts as a sensor and a monitor, and we can change it significantly in response to warning signs. At a given maturity level, we can collect the measurements for that level and all the levels below it. Process maturity suggests measuring only what is visible.
Thus, the combination of process maturity with GQM provides the most useful measures. At level 1, the project is likely to have ill-defined requirements, so measuring requirement characteristics is difficult. At level 2, the requirements are well-defined, and additional information, such as the type of each requirement and the number of changes to each type, can be collected. At level 3, intermediate activities are defined, with entry and exit criteria for each activity.
The goal and question analysis will be the same, but the metric will vary with maturity. The more mature the process, the richer will be the measurements. The GQM paradigm, in concert with the process maturity, has been used as the basis for several tools that assist managers in designing measurement programs.
GQM helps to understand the need for measuring the attribute, and process maturity suggests whether we are capable of measuring it in a meaningful way.
Together they provide a context for measurement. Measures or measurement systems are used to assess an existing entity by numerically characterizing one or more of its attributes. A measure is valid if it accurately characterizes the attribute it claims to measure.
Validating a software measurement system is the process of ensuring that the measure is a proper numerical characterization of the claimed attribute, by showing that the representation condition is satisfied. For validating a measurement system, we need both a formal model that describes entities and a numerical mapping that preserves the attribute we are measuring. For example, if there are two programs P1 and P2, and we want to concatenate them, then we expect any measure m of length to satisfy

m(P1; P2) = m(P1) + m(P2)

If a program P1 has greater length than a program P2, then any measure m should also satisfy

m(P1) > m(P2)

The length of a program can be measured by counting its lines of code. If this count satisfies the above relationships, we can say that lines of code are a valid measure of length. The formal requirement for validating a measure involves demonstrating that it characterizes the stated attribute in the sense of measurement theory.
Prediction systems are used to predict some attribute of a future entity, using a mathematical model with associated prediction procedures. Validating a prediction system in a given environment is the process of establishing the accuracy of the prediction system by empirical means, i.e., by experimentation and hypothesis testing. The degree of accuracy acceptable for validation depends on whether the prediction system is deterministic or stochastic, as well as on the person doing the assessment.
Some stochastic prediction systems are more stochastic than others. Examples of stochastic prediction systems are systems such as software cost estimation, effort estimation, schedule estimation, etc.
Hence, to validate a prediction system formally, we must decide how stochastic it is, then compare the performance of the prediction system with known data. Software metrics is a standard of measure covering the many activities that involve some degree of measurement.
It can be classified into three categories: product metrics, process metrics, and project metrics. Product metrics describe the characteristics of the product such as size, complexity, design features, performance, and quality level.
Process metrics can be used to improve software development and maintenance. Examples include the effectiveness of defect removal during development, the pattern of testing defect arrival, and the response time of the fix process.
Project metrics describe the project characteristics and execution. Software measurement is a diverse collection of these activities that range from models predicting software project costs at a specific stage to measures of program structure.
Cost and effort estimation models have been proposed to predict project cost during the early phases of the software life cycle. Effort is expressed as a function of one or more variables, such as the size of the program, the capability of the developers, and the level of reuse. Productivity can be considered as a function of value and cost.
Each can be decomposed into measurable components: size, functionality, time, money, etc. The possible components of a productivity model can be expressed in the following diagram. The quality of any measurement program clearly depends on careful data collection. Collected data can be distilled into simple charts and graphs so that managers can understand the progress and problems of the development. Data collection is also essential for the scientific investigation of relationships and trends.
Quality models have been developed for measuring the quality of the product, without which productivity is meaningless. These quality models can be combined with productivity models to measure productivity correctly.
These models are usually constructed in a tree-like fashion. The upper branches hold important high-level quality factors such as reliability and usability. This divide-and-conquer approach has become the standard way of measuring software quality. Most quality models include reliability as a component factor; however, the need to predict and measure reliability has led to a separate specialization in reliability modeling and prediction.
The basic problem in reliability theory is to predict when a system will eventually fail. Performance is another aspect of quality. It includes externally observable system performance characteristics, such as response times and completion rates, and the internal workings of the system, such as the efficiency of its algorithms.
Here we measure the structural attributes of representations of the software, which are available in advance of execution.
Then we try to establish, empirically, predictive theories to support quality assurance, quality control, and quality prediction. The SEI's capability maturity model can assess many different attributes of development, including the use of tools, standard practices, and more. It is based on the key practices that every good contractor should be using. Measurement has a vital role in managing a software project. To check whether a project is on track, users and developers can rely on measurement-based charts and graphs.
A standard set of measurements and reporting methods is especially important when the software is embedded in a product whose customers are not usually well versed in software terminology.
Evaluating methods and tools depends on the experimental design, proper identification of the factors likely to affect the outcome, and appropriate measurement of factor attributes.
The success of software measurement lies in the quality of the data collected and analyzed. Are they correct? Are they accurate? Are they appropriately precise?
Are they consistent? Are they associated with a particular activity or time period? Can they be replicated? Hence, the data should be easy to replicate; for example, the weekly timesheets of the employees in an organization. Data collection requires human observation and reporting: managers, system analysts, programmers, testers, and users must record raw data on forms. Provide the results of data capture and analysis to the original providers promptly, and in a useful form that will assist them in their work.
Once the set of metrics is clear and the set of components to be measured has been identified, devise a scheme for identifying each activity involved in the measurement process.
Data collection planning must begin when project planning begins. Actual data collection takes place during many phases of development. An example of a database structure is shown in the following figure. This database will store the details of different employees working in different departments of an organization. In the above diagram, each box is a table in the database, and the arrow denotes the many-to-one mapping from one table to another. The mappings define the constraints that preserve the logical consistency of the data.
Once the database is designed and populated with data, we can make use of the data manipulation languages to extract the data for analysis. After collecting relevant data, we have to analyze it in an appropriate way.
Model checking consists of building an abstract model of the software that is sufficiently small that its state space can be subjected to an exhaustive search. Although this is a promising approach, our problem pops up again in the question of how to build the model of the software. This task is intrinsically difficult and has to be carried out by human experts [ 5 ]. The likelihood of an intentional backdoor in the code showing up in the model is therefore quite similar to the likelihood of it being detected by reviewing the code itself.
Model checking is therefore not a silver bullet to our problem, so we move on to seeing if code review can help. As argued in Chap. In the case of the black box, we are, in principle, left with testing, a situation we discussed in Sect. This leaves us with the testing situations in which the binaries are available.
Code reviews of binaries have many of the same strengths and weaknesses as formal methods. On the one hand, they hold the promise that, when carried out flawlessly and on a complete binary representation of a piece of software, all hidden malicious code can be found. Unfortunately, there are two major shortcomings: one is that finding all hidden malicious code is generally impossible, as shown in Chap.
Even finding some instances will be extremely challenging if the malicious code has been obfuscated, as described in Chap. The second shortcoming is one of scale. Down et al. Morrison et al. To make things worse, these figures are based on the assumption that it is the source code that is to be analysed, not the binaries; the real costs are therefore likely to be higher.
One way to handle this problem is to perform a code review on only parts of the software product. Which parts are to be selected for scrutiny could be chosen by experts or by a model that helps choose the parts that are more likely to contain malicious code.
Such defect prediction models have been used for some time in software engineering to look for unintended faults, and a range of similar models have been proposed for vulnerability prediction.
The value of the vulnerability prediction models is currently under debate [ 23 ]. We do not need to take a stand in that discussion. Rather, we observe that the models try to identify modules that are more likely to contain unintended vulnerabilities.
If the vulnerabilities are intended, however, and we can assume that they are hidden with some intelligence, then a model is not likely to be able to identify the hiding places. Even if such a model existed, it would have to be kept secret to remain effective. It is thus a sad fact that rigorous code reviews and vulnerability detection models also leave ample opportunity for a dishonest equipment producer to include malicious code in its products.
Fuggetta and Di Nitto [ 13 ] took a historical view of what was perceived as the most important trends in software process research in and compared this to what was arguably most important in If these tools and methods are inadequate to help the developers themselves find unintended security holes, they are probably a far cry from helping people outside the development team find intentional and obfuscated security holes left there by a dishonest product developer.
The findings in this chapter confirm the above observation. We have covered the concepts of development processes, quality models, and quality management, and have found that they relate too indirectly to the code that runs on the machine to qualify as adequate entry points for verifying the absence of deliberately inserted malware.
The field of software quality metrics in some aspects directly relates to the source code but, unfortunately, it is doubtful that all aspects of security are measurable. In addition, counting the number of bugs found has still not given us bug-free software; thus, it is not likely that counting security flaws will provide any guidance in finding deliberately inserted malware. Methods that go deep into the code running on the machine include testing, code reviews, and formal methods. In the present state of the art, formal methods do not scale to the program sizes we require, code review is error prone and too time-consuming to be watertight, and testing has limited value because of the problem of hitting the right test vector that triggers the malicious behaviour.
Even when triggered, recognizing the malicious behaviour may be highly nontrivial. It is clear that these methods still leave ample space for dishonest equipment providers to insert unwanted functionality without any real danger of being exposed. Although these approaches still fall far short of solving our problem, they remain a viable starting point for research on the topic. In particular, we hope that, at some time in the future, the combination of formal methods, careful binary code review, and testing will increase to a significant level the risk of being exposed if you include malware in your products.
However, we have a long way to go and a great deal of research must be done before that point is reached.
Software Quality and Quality Management.
Author: Olav Lysne. Open Access. First Online: 20 February. Soon, however, it became evident that the engineering of software presented its own very specific challenges. Unintentional errors in the code appeared to be hard to avoid and, when created, equally hard to find and correct. Throughout the decades, a great deal of effort has been invested into finding ways to ensure the quality of developed software.
These investments have provided insights into some key properties that make software quality assurance different from what we find in other engineering disciplines [ 14 ]. Complexity: One way of measuring the complexity of a product is to count the number of distinct operating modes in which it is supposed to function.
Visibility: Software is invisible. Manufacturing: The production phase is very different in software engineering, since it simply consists of making some files available, either through download or through the standardized process of producing storage media with the files on them.
The goal of any software development process is to promote the quality of the software under development. The notion of software quality is, however, highly nontrivial and it has therefore been subject to significant research and standardization efforts.
These efforts have a common starting point in the work of McCall et al. Simply, a given software entity can exist in three apparent states of correctness:
So too, software can be correct, or defective, or neither known to be correct nor defective. The latter state collapses into exactly one of the former when it is evaluated against a specification. In this instalment I'm going to consider the notion that most, perhaps all, software systems are built up from layers of abstraction most of which are in the disconcerting third state of uncertain correctness.
Furthermore, I'm going to argue that software has to be like this, and that's what makes it challenging, fun, and not a little frightening. Note: I'm still not going to discuss the definition of what a specification is in this instalment. What a tease! A perennial debate within and without the software community is whether software development is an engineering discipline and, if not, why not.
Well, despite plentiful misuse of the term 'software engineer' in my past, I'm increasingly moving over to the camp of those whose opinion is that it is not an engineering discipline. To illustrate why, I'm going to draw from three of my favourite branches of science: Newtonian physics, chaos theory, and quantum physics, with a modicum of logic thrown in for good measure.
As my career has progressed - both as practitioner (programmer, consultant) and as imperfect philosopher (author, columnist) - the issues around software quality have grown in importance to me. The one that confounds and drives me more than all others is what I believe to be the central dichotomy of software system behaviour: The Unpredictable Exactitude Paradox: Software entities are exact and act precisely as they are programmed to do, yet the behaviour of non-trivial computer systems cannot be precisely understood, predicted, nor relied upon to refrain from exhibiting deleterious behaviour.
Note that I say programmed to do, not designed to do, because a design and its reification into program form are often, perhaps mostly, perhaps always, divergent. Hence the purpose of this column, and, to a large extent, the purposes of our careers.
The issue of the commonly defective transcription of requirements to design to code will have to wait for another time. Consider the behaviour of the computer on which I'm writing this epistle.
Assuming perfect hardware, it's still the case that the sequence of actions - involving processor, memory, disk, network - carried out on this machine during the time I've written this paragraph have never been performed before, and that it is impossible to rely on the consequences of those actions. And that is despite the fact that the software is doing exactly what it's been programmed to do. Essentially, this is because software operates on the certain, strict interpretation of binary states, and there are no natural attenuating mechanisms in software at the scale of these states.
If one Iron atom in a bridge is replaced by, say, a Molybdenum atom, the bridge will not collapse, nor exhibit any measurable difference in its ability to be a bridge.
Conversely, an errant bit in a given process may have no effect whatsoever, or may manifest benignly. We, as software developers, need language to support our reasoning and communication about software, and it must address this paradox; otherwise we'll be stuck in fruitless exchanges, often between programmers and non-programmers (clients, users, project managers), each of whom, I believe, tends to think and see the software world at different scales.
I will continue the established use of the term correctness to represent exactitude. And I will, influenced somewhat by Meyer and McConnell, use the terms robustness and reliability in addressing the inexact, unpredictable, real behaviour of software entities. Let's look at some code. Remember the first of the Bet-Your-Life?
Test cases from the last instalment [ QM-1 ]: In fact, it'd be pretty hard to write any implementation other than this. Certainly there are plenty of possibly apocryphal screeds of long-winded alternative implementations available on the web (such as on www. With languages that have a bona fide Boolean type, such as Java and C#, the value may not need to be compared against 0, and may well be implemented as equal to true or to false.
In those, comparison against zero is necessary, even for their built-in bool types! In either case, it's almost impossible to implement this function incorrectly. If we permit ourselves the luxury of assuming a correctly functioning execution environment, then without recourse to any automated techniques, or even to a detailed written specification, we may reasonably assert the correctness of this function by visual inspection.
Now consider the definition of strcmp, the second Bet-Your-Life? Test case: Even with a function as logically straightforward as strcmp, there are different ways to implement it. What was interesting to me was that I had forgotten the nuances resolved in previous implementations, and I initially wrote the last line as: Then (being as how I'm writing a column about software quality, and my Spidey-sense set to max) I stopped and wondered how this would work between compilation contexts where char is signed or unsigned.
Clearly, for certain ranges of values, the negation result will be different between the two. I then immediately set about writing a test program, and building for both signed and unsigned modes, and saw the suspected different behaviour for certain strings with values in the range 0x80 - 0xff.
I've been programming C for, ulp! Either way, it's quite sobering. What this illustrates all too well is that software is exact, humans operate on assumption and expectation, and the two are not good bedfellows. When I spoke to my good friend and regular reviewer, Garth Lancaster, about this, he too was ignorant of the unsigned comparison aspect of strcmp.

Reusability: A software product has excellent reusability if different modules of the product can quickly be reused to develop new products.
Correctness: A software product is correct if various requirements as specified in the SRS document have been correctly implemented. Maintainability: A software product is maintainable if bugs can be easily corrected as and when they show up, new tasks can be easily added to the product, and the functionalities of the product can be easily modified, etc.
A quality management system comprises the principal methods used by organizations to ensure that the products they develop have the desired quality. Managerial structure and individual responsibilities: A quality system is the responsibility of the organization as a whole.
However, every organization has a separate quality department to perform the various quality system activities. The quality system of an organization should have the support of top management. Without support for the quality system at a high level in the company, few members of staff will take the quality system seriously. Quality system activities: The quality system activities encompass the following:
Production of documents for top management summarizing the effectiveness of the quality system in the organization. Quality systems have evolved considerably over the last five decades. Before World War II, the usual method of producing quality products was to inspect the finished products and remove defective ones. Since then, the quality systems of organizations have gone through four stages of evolution, as shown in the figure.
The initial product inspection method gave way to quality control (QC). Quality control focuses not only on detecting defective items and removing them, but also on determining the causes behind the defects. Thus, quality control aims at correcting the causes of defects and not just rejecting the products. The next breakthrough in quality methods was the development of quality assurance methods. The basic premise of modern quality assurance is that if an organization's processes are proper and are followed rigorously, then the products are bound to be of good quality.