Critical systems often use N-modular redundancy to tolerate faults in subsystems. Traditional approaches to N-modular redundancy in distributed, loosely-synchronised, real-time systems handle time and value errors separately: a voter detects value errors, while watchdog-based health monitoring detects timing errors. In prior work, we proposed the integrated Voting on Time and Value (VTV) strategy, which allows both timing and value errors to be detected simultaneously. In this paper, we show how VTV can be harnessed as part of an overall fault tolerance strategy and evaluate its performance using a well-known control application, the Inverted Pendulum. Through extensive simulations, we compare the performance of Inverted Pendulum systems that employ VTV and alternative voting strategies, demonstrating that VTV better tolerates well-recognised faults in this realistically complex control problem.
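To make the voting idea concrete, the following Python sketch (not taken from the paper; the Reading structure, thresholds, and majority rule are illustrative assumptions) shows how a single voter can reject a replica's output for being late as well as for disagreeing in value:

```python
from dataclasses import dataclass

@dataclass
class Reading:
    value: float      # replica's output value
    timestamp: float  # arrival time in seconds

def vtv_vote(readings, expected_time, time_window, value_epsilon):
    """Illustrative vote on time and value together.

    A reading is rejected as a timing error if it falls outside the
    acceptance window, and as a value error if no majority of timely
    readings agrees with it to within value_epsilon.
    """
    timely = [r for r in readings
              if abs(r.timestamp - expected_time) <= time_window]
    for r in timely:
        agreeing = [s for s in timely
                    if abs(s.value - r.value) <= value_epsilon]
        if len(agreeing) > len(readings) // 2:  # majority agreement
            return sum(s.value for s in agreeing) / len(agreeing)
    return None  # no majority of timely, agreeing readings: signal a fault

# Usage: three replicas; the third is both late and value-faulty.
rs = [Reading(1.00, 10.001), Reading(1.02, 10.002), Reading(9.99, 10.300)]
print(vtv_vote(rs, expected_time=10.0, time_window=0.01, value_epsilon=0.1))
```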
In some domains, standards such as ISO 26262 or the UK Ministry of Defence's Defence Standard 00-56 require developers to produce a safety case. As the safety case for a complex system can be rather large, automated verification of all or part of it would be valuable. We have approached the issue by designing a method supported by a framework including analysers for safety cases defined in the Goal Structuring Notation (GSN) and systems modelled in the Architecture Analysis and Design Language (AADL). In our approach, the safety case predicates are defined in a subset of the functional language Meta Language (ML). Our approach facilitates formalising some parts of a typical safety argument in an ML-like notation, enabling automatic verification of some reasoning steps in the safety argument. Automatic verification not only justifies increased confidence, it can ease the burden of re-checking the safety argument as it (and the system) change.
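As a minimal illustration of the kind of predicate involved (the framework defines its predicates in an ML subset; this sketch uses Python for concreteness, and the model structure is hypothetical), a safety-case claim about partition isolation can be checked mechanically against an architecture model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Connection:
    source: str
    dest: str

@dataclass
class ArchModel:
    partition: dict    # component name -> partition name (hypothetical)
    connections: list  # list of Connection

def connections_respect_partitions(model: ArchModel) -> bool:
    """Illustrative safety-case predicate: no connection in the
    architecture model crosses a partition boundary."""
    return all(model.partition[c.source] == model.partition[c.dest]
               for c in model.connections)

# Usage: a two-partition model with one offending connection.
m = ArchModel(partition={"sensor": "P1", "logger": "P2"},
              connections=[Connection("sensor", "logger")])
assert connections_respect_partitions(m) is False
```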
For a large and complex safety-critical system, where safety is ensured by strict control over many properties, the safety information is structured into a safety case. As a small change to the system design may potentially affect a large section of the safety argumentation, a systematic method for evaluating the impact of system changes on the safety argumentation would be valuable. We have chosen two of the most common notations: the Goal Structuring Notation (GSN) for the safety argumentation and the Architecture Analysis and Design Language (AADL) for the system architecture model. In this paper, we address the problem of impact analysis by introducing the GSN and AADL Graph Evaluation (GAGE) method, which maps the safety argumentation structure against the system architecture; this mapping is also a prerequisite for successful composition of modular safety cases. In order to validate the method, we have implemented the GAGE tool, which supports the mapping between the GSN and AADL notations and highlights the impact of changes on the argumentation.
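A minimal sketch of the GAGE idea, with invented GSN element names and AADL component names: given a mapping from argument elements to the architecture elements they depend on, the elements affected by a change fall out of a simple intersection.

```python
# Hypothetical mapping: GSN element -> AADL components it depends on.
gsn_to_aadl = {
    "G1: Braking is safe": {"brake_controller"},
    "G2: Sensor data is fresh": {"wheel_sensor", "bus"},
    "Sn1: Test report TR-7": {"brake_controller"},
}

def impacted_elements(changed_components, mapping=gsn_to_aadl):
    """Return GSN elements whose mapped AADL components changed."""
    changed = set(changed_components)
    return [g for g, comps in mapping.items() if comps & changed]

print(impacted_elements({"wheel_sensor"}))  # -> ['G2: Sensor data is fresh']
```

The actual tool operates on full GSN and AADL models rather than hand-written dictionaries, but the highlighting obligation has this shape.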
Building and operating safety critical systems requires making decisions about risk mitigations and their sufficiency. These decisions depend upon a safety case comprising many interrelated pieces of safety evidence and an argument linking these to a claim of adequate safety. Safety argumentation research has drawn from several other disciplines, including the law, philosophy, and artificial intelligence. But it has not yet achieved a normative theory of assurance argumentation that makes trustworthy predictions about the accuracy of safety claims. In this paper, we discuss the aims, achievements, and challenges of safety argumentation theory research and identify opportunities for future work.
An assurance case comprises evidence and argument showing how that evidence supports assurance claims (e.g., about safety or security). It is unsurprising that some computer scientists have proposed formalising assurance arguments: most associate formality with rigour. But while engineers can sometimes prove that source code refines a formal specification, it is not clear that formalisation will improve assurance arguments or that this benefit is worth its cost. For example, formalisation might reduce the benefits of argumentation by limiting the audience to people who can read formal logic. In this paper, we present (1) a systematic survey of the literature surrounding formal assurance arguments, (2) an analysis of errors that formalism can help to eliminate, (3) a discussion of existing evidence, and (4) suggestions for experimental work to definitively answer the question.
The Goal Structuring Notation (GSN) is a popular graphical notation for recording safety arguments. One of GSN's key innovations is a context element that links short phrases used in the argument to detail available elsewhere. However, definitions of the context element admit multiple interpretations and conflict with guidance for building assured safety arguments. If readers do not share an understanding of the meaning of context that makes context's impact on the main safety claim clear, confidence in safety might be misplaced. In this paper, we analyse the definitions and usage of GSN context elements, identify contradictions and vagueness, propose a more precise definition, and make updated recommendations for assured safety argument structure.
Reasoning about system safety requires reasoning about confidence in safety claims. For example, DO-178B requires developers to determine the correctness of the worst-case execution time of the software. It is not possible to do this beyond any doubt. Therefore, developers and assessors must consider the limitations of execution time evidence and their effect on the confidence that can be placed in execution time figures, timing analysis results, and claims to have met timing-related software safety requirements. In this paper, we survey and assess existing concepts that might serve as means of describing and reasoning about confidence, including safety integrity levels, probability distributions of failure rates, Bayesian Belief Networks, argument integrity levels, and Baconian probability. We define use cases for confidence in safety cases, prescriptive standards, certification of component-based systems, and the reuse of safety elements both in and out of context. From these use cases, we derive requirements for a confidence framework. We assess existing techniques by discussing what is known about how well each confidence metric meets these requirements. Our results show that no existing confidence metric is ideally suited for all uses. We conclude by discussing implications for future standards and for reuse of safety elements.
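As a small worked example of one surveyed concept (the probabilities here are invented for illustration, not drawn from the paper), a Bayesian treatment updates confidence in a safety claim when an item of evidence arrives:

```python
def posterior_confidence(prior, p_evidence_given_true, p_evidence_given_false):
    """Bayes' rule: updated confidence that a safety claim holds,
    given one item of supporting evidence."""
    p_e = (p_evidence_given_true * prior
           + p_evidence_given_false * (1 - prior))
    return p_evidence_given_true * prior / p_e

# E.g. a timing-analysis report that is much likelier to exist if the
# WCET claim is true raises confidence from 0.80 to about 0.97.
print(round(posterior_confidence(0.80, 0.90, 0.10), 2))
```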
Component-based software engineering researchers have explored component reuse, typically at the source-code level. Contracts explicitly describe component behaviour, reducing development risk by exposing potential incompatibilities early. But to benefit fully from reuse, developers of safety-critical systems must also reuse safety evidence. Full reuse would require both extending the existing notion of component contracts to cover safety properties and using these contracts in both component selection and system certification. In this paper, we explore some of the ways in which this is not as simple as it first appears.
Timing is often seen as the most important property of systems after function, and safety-critical systems are no exception. In this paper, we consider how timing is typically treated in safety assurance and, in particular, in the safety arguments being proposed by industry and academia. We critique these arguments based on how systems are generally developed and how evidence is gathered. This critique exposes significant weaknesses, and we propose a more appropriate safety argument in response. As part of this work, we use techniques for identifying relationships, in the form of contracts, between parts of the argument and the strength of evidence. We demonstrate the work using a Computer Assisted Braking example, specifically an Anti-Lock Braking System for a car, as it is a classic example of a component that may be used 'out of context', as discussed in a number of safety standards, and may also be reused across a number of systems as well as part of a product line.
Software engineering researchers have extensively explored the reuse of components at source-code level. Contracts explicitly describe component behaviour, reducing development risk by exposing potential incompatibilities early in the development process. But to benefit fully from reuse, developers of safety-critical systems must also reuse safety evidence. Full reuse would require both extending the existing notion of component contracts to cover safety properties and using these contracts in both component selection and system certification. This is not as simple as it first appears. Much of the review, analysis, and test evidence developers provide during certification is system-specific. This makes it difficult to define safety contracts that facilitate both selecting components to reuse and certifying systems. In this paper, we explore the definition and use of safety contracts, identify challenges to component-based software reuse in safety-critical systems, present examples to illustrate several key difficulties, and discuss potential solutions to these problems.
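A schematic sketch of what such a safety contract check might look like (all field names and thresholds are assumptions for illustration): the final conjunct, matching the evidence's assumed context against the actual system, is where system-specific evidence makes reuse difficult.

```python
from dataclasses import dataclass

@dataclass
class SafetyContract:
    wcet_ms: float        # guaranteed worst-case execution time
    failure_rate: float   # claimed dangerous failures per hour
    assumed_context: set  # assumptions the safety evidence depends on

def compatible(contract: SafetyContract, budget_ms: float,
               required_rate: float, system_context: set) -> bool:
    """Selection-time check: the component's guarantees meet the
    system's needs AND the assumptions behind its evidence hold in
    this particular system."""
    return (contract.wcet_ms <= budget_ms
            and contract.failure_rate <= required_rate
            and contract.assumed_context <= system_context)
```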
Conformance to software standards plays an essential role in establishing confidence in high-integrity software systems. However, standards conformance suffers from uncertainty about its meaning for three reasons: because requirements of the standard must be interpreted to fit the specifics of the application; because standards can deliberately leave options for developers; and because goal-based software standards exist that simply specify the high-level principles of software assurance without prescribing a specific means of compliance. The overall effect of these issues is that when conformance to a software assurance standard is claimed, there can be a lack of clarity as to exactly what the claim entails. This article draws on principles and practice from the domain of safety argument construction to describe the use of explicit and structured conformance arguments to help address this problem.
Context: Many people and organisations rely upon software safety and security standards to provide confidence in software intensive systems. For example, people rely upon the Common Criteria for Information Technology Security Evaluation to establish justified and sufficient confidence that an evaluated information technology product's contributions to security threats and threat management are acceptable. Is this standard suitable for this purpose? Objective: We propose a method for assessing whether conformance with a software safety or security standard is sufficient to support a conclusion such as adequate safety or security. We hypothesise that our method is feasible and capable of revealing interesting issues with the proposed use of the assessed standard. Method: The software safety and security standards with which we are concerned require evidence and discuss the objectives of that evidence. Our method is to capture a standard's evidence and objectives as an argument supporting the desired conclusion and to subject this argument to logical criticism. We have evaluated our method by case study application to the Common Criteria standard. Results: We were able to capture and criticise an argument from the Common Criteria standard. Review revealed 121 issues with the analysed use of the standard. These range from vagueness in its text to failure to require evidence that would substantially increase confidence in the security of evaluated software. Conclusion: Our method was feasible and revealed interesting issues with using a Common Criteria evaluation to support a conclusion of adequate software security. Considering the structure of similar assurance standards, we see no reason to believe that our method will not prove similarly valuable in other applications.
Many systems deliberately manage interference between software components, e.g. through partitioning. When engineers modifying such software determine which items of verification evidence have been invalidated by changes, they consider interference management measures. A complete understanding of interference and its management is crucial when engineers re-use evidence. In prior work, we suggested: (a) a guided process for identifying interference and means of managing it; and (b) a strategy for arguing about interference management. In this paper, we present the results of a case study meant to answer two questions raised by this prior work: (i) which views of the system engineers should consider when identifying interference and its management; and (ii) whether our argument pattern captures a practical way to argue about interference management.
Assurance Based Development (ABD) is a novel approach to the synergistic construction of critical software systems and their assurance arguments. In ABD, the need for assurance drives a unique process synthesis mechanism that results in a detailed process for building both software and an argument demonstrating its fitness for use in given operating contexts. In this paper, we introduce the ABD process synthesis mechanism. A key element of ABD process synthesis is the success argument, an argument which documents developers' rationale for believing that the development effort in progress will result in a system that demonstrably meets an acceptable balance of all stakeholder goals. Such goals include safety and security requirements for systems using the software as a component and time and budget constraints. We also present the details of a case study in which we used ABD to develop the control software for a prototype artificial heart pump.
Unmanned Aircraft Systems (UASs) rely upon a significant amount of software. An appropriate authorizing agent must approve the use of a UAS in a desired environment before it is deployed, and the authorization approach used must contend with the UAS' software. Unfortunately, the variety of UAS types, the range of environments within which they must operate, and the need to address both safety and security concerns make authorization of UAS software problematic. We have developed a flexible approach to UAS software authorization that is able to deal with these challenges. Our approach is based on rigorous fitness arguments that explain how evidence from the software's development shows that the software has the properties that make the UAS fit for use in the intended operating contexts. In this paper, we present the details of our approach, compare it to existing approaches, and show how retroactive construction of software fitness arguments can identify the additional evidence necessary to support full authorization or the limited authorization that can be granted based on existing evidence. We give examples to illustrate how our approach can be used across a wide variety of UASs, missions, and operating environments, including controlled airspace.
The technology for building dependable computing systems has advanced dramatically. Nevertheless, there is still no complete solution to building software for critical systems in which every aspect of software dependability can be demonstrated with high confidence. In this paper, we present the results of a case study exploration of the practical limitations on software dependability. We analyze a software assurance argument for weaknesses and extrapolate a set of limitations including dependence upon correct requirements, dependence upon reliable human-to-human communication, dependence upon human compliance with protocols, dependence upon unqualified tools, the difficulty of verifying low-level code, and the limitations of testing. We discuss each limitation's impact on our specimen system and potential mitigations.
Developers of some safety critical systems construct a safety case. Developers changing a system during development or after release must analyse the change's impact on the safety case. Evidence might be invalidated by changes to the system design, operation, or environmental context. Assumptions valid in one context might be invalid elsewhere. The impact of change might not be obvious. This paper proposes a method to facilitate safety case maintenance by highlighting the impact of changes.
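A minimal sketch of the highlighting idea, assuming a toy safety case with hypothetical element names: suspicion propagates from invalidated evidence upward through the support links.

```python
# child -> parent argument elements it supports (hypothetical structure)
supports = {
    "Ev: unit tests": ["G3"],
    "G3": ["G1"],
    "Ev: FTA": ["G2"],
    "G2": ["G1"],
}

def suspect_elements(invalidated):
    """Return every argument element whose support is (transitively)
    built on invalidated evidence."""
    suspect, frontier = set(), list(invalidated)
    while frontier:
        node = frontier.pop()
        for parent in supports.get(node, []):
            if parent not in suspect:
                suspect.add(parent)
                frontier.append(parent)
    return suspect

print(suspect_elements({"Ev: unit tests"}))  # -> {'G3', 'G1'}
```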
Preliminary safety assessment is an important activity in safety-critical systems development since it provides insight into the proposed system's ability to meet its safety requirements. Because preliminary safety assessment is conducted before the system is implemented, developers rely on high-level designs of the system to assess safety, reducing the risk of finding issues later in the process. Since the system architecture is the first design artefact developers produce, developers invest considerable time in assessing the architecture's impact on system safety. Typical safety standards require developers to show that a plan of safety activities, chosen from recommended options or alternatives, meets a set of objectives. More specifically, the automotive safety standard ISO 26262 recommends formally verifying the software architecture to show that it “complies” with safety requirements. In this paper, we apply an architecture-based verification technique for Architecture Analysis and Design Language (AADL) specifications to an architectural design for a fuel level estimation system to validate certain architectural properties. Subsequently, we build part of the conformance argument to show how the model checking can satisfy some ISO 26262 obligations. Furthermore, we show how the method could be used as part of preliminary safety assessment and how it can be upheld by later implementations alongside the other recommended methods.
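Purely to illustrate the shape of such an obligation (the component names, latencies, and budget are invented; the paper's technique model-checks AADL specifications rather than summing numbers), an architectural timing property can be stated and checked against the design:

```python
# Hypothetical flow through the fuel-level estimation architecture:
# (component, worst-case latency contribution in ms)
flow = [("sensor", 5), ("filter", 10), ("estimator", 8)]
LATENCY_BUDGET_MS = 30  # illustrative budget from a safety requirement

def flow_latency_ok(flow, budget_ms):
    """True if the summed worst-case component latencies fit the
    end-to-end latency budget derived from the safety requirement."""
    return sum(ms for _, ms in flow) <= budget_ms

assert flow_latency_ok(flow, LATENCY_BUDGET_MS)  # 23 ms <= 30 ms
```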
Mutation testing has been used to assess test suite coverage, and researchers have proposed adapting the idea for other uses. Safety kernels allow the use of untrusted software components in safety-critical applications: a trusted software safety kernel detects undesired behavior and takes remedial action. We propose to use specification mutation, model checking, and model-based testing to verify safety kernels for component-based, safety-critical computer systems.
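A toy Python illustration of specification mutation (the guard, mutation operators, and test cases are hypothetical; the proposal itself combines mutation with model checking and model-based testing): each mutant of a kernel guard should be distinguishable from the original, otherwise the verification is too weak to notice that class of specification error.

```python
import operator

# A safety-kernel guard, e.g. "shut down if pressure exceeds the limit".
def guard(pressure, limit):
    return pressure > limit

# Specification mutation: systematically swap the relational operator,
# then confirm the verification suite "kills" each mutant, i.e. some
# case distinguishes the mutated specification from the original.
MUTANT_OPS = [operator.ge, operator.lt, operator.le, operator.eq]

def make_mutant(op):
    return lambda pressure, limit: op(pressure, limit)

test_cases = [(99, 100), (100, 100), (101, 100)]

for op in MUTANT_OPS:
    mutant = make_mutant(op)
    killed = any(guard(p, l) != mutant(p, l) for p, l in test_cases)
    print(op.__name__, "killed" if killed else "SURVIVED")
```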
The use of contracts to enhance the maintainability of safety-critical systems has received a significant amount of research effort in recent years. However, some key issues have been identified: the difficulty of dealing with the wide range of properties of systems and deriving contracts to capture those properties; and the challenge of dealing with the inevitable incompleteness of the contracts. In this paper, we explore how contracts can be derived from the results of failure analysis. We use the concept of safety kernels to alleviate these issues. First, the safety kernel means that the properties of the system that we may wish to manage can be dealt with at a more abstract level, reducing the challenges of representation and completeness of the “safety” contracts. Second, the set of safety contracts is reduced, so it is possible to reason about their satisfaction in a more rigorous manner.
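As a minimal sketch of a contract derived from failure analysis and enforced by a safety kernel (the hazard, limit, and command structure are illustrative assumptions):

```python
# A failure analysis identifies a hazardous condition ("commanded rate
# exceeds the safe maximum"), which becomes a safety contract enforced
# at an abstract level by the safety kernel.

SAFE_MAX_RATE = 50.0  # illustrative limit derived from failure analysis

def kernel_check(command):
    """Safety kernel: admit the untrusted component's command only if
    it satisfies the contract; otherwise substitute a safe fallback."""
    if abs(command["rate"]) <= SAFE_MAX_RATE:
        return command
    return {"rate": 0.0, "reason": "contract violation: rate too high"}

print(kernel_check({"rate": 120.0}))
```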