Improving the Stop-Test Decision When Testing Data are Slow to Converge
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.ORCID iD: 0000-0003-4127-5839
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. University of York, York, UK.ORCID iD: 0000-0003-2415-8219
2016 (English). Report (Other academic)
Abstract [en]

Testing of safety-critical systems is an important and costly endeavor. To date, work has mainly focused on the design and application of diverse testing strategies, leaving the important decision of “when to stop testing” as an open research issue. In our previous work, we proposed a convergence algorithm that informs the tester when it is concluded that further testing will not reveal sufficiently important new findings and hence should be stopped. The stop-test decision proposed by the algorithm was in the context of testing the worst-case timing characteristics of a system and was evaluated based on the As Low As Reasonably Practicable (ALARP) principle. The ALARP principle, a cost-benefit argument, is an underpinning concept in many safety standards. ALARP implies that a tolerable risk should be reduced to a point at which further risk reduction is grossly disproportionate to the benefit attained. An ALARP stop-test decision means that the cost associated with further testing, after the algorithm stops, does not justify the benefit, i.e., any further increase in the observed worst-case timing.

In order to make a stop-test decision, the convergence algorithm used the Kullback-Leibler DIVergence (KL DIV) statistical test and was shown to be successful when applied to system tasks having similar characteristics. However, there were some experiments in which the stop-test decision did not comply with the ALARP principle, i.e., testing stopped sooner than the ALARP criteria would allow. Therefore, in this paper, we investigate whether the performance of the algorithm could be improved in such experiments, focusing on the KL DIV test. More specifically, we first determine which features of KL DIV could adversely affect the algorithm’s performance. Secondly, we investigate whether another statistical test, the Earth Mover’s Distance (EMD), could cover the weaknesses of KL DIV. Finally, we experimentally evaluate the hypothesis that EMD improves the algorithm in those cases where KL DIV did not perform as expected.
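The distinction between the two statistical tests named above can be illustrated with a small sketch (the histograms below are hypothetical, not data from the paper's experiments): KL divergence compares probabilities bin by bin and is unbounded when mass appears in a bin the reference distribution left empty, whereas EMD weighs the probability mass moved by the distance it travels along the response-time axis.

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance

# Two response-time histograms over the same bins (hypothetical data).
# In q, the 0.10 mass sitting at 120 in p has shifted to the 130 bin.
bins = np.array([100.0, 110.0, 120.0, 130.0])  # response-time bin centres
p = np.array([0.70, 0.20, 0.10, 0.00])
q = np.array([0.70, 0.20, 0.00, 0.10])

# KL divergence compares the histograms bin-by-bin; a tiny epsilon avoids
# division by zero, but the value still blows up on the emptied bin.
eps = 1e-12
kl = entropy(p + eps, q + eps)

# EMD (the 1-D Wasserstein distance) instead measures mass times distance
# moved: 0.10 of the mass travelled 10 units, so the distance is 1.0.
emd = wasserstein_distance(bins, bins, u_weights=p, v_weights=q)
print(kl, emd)
```

Note how a small, bounded shift along the time axis yields a small EMD but a large KL value; this is the kind of sensitivity difference the paper investigates.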

Place, publisher, year, edition, pages
Sweden: Mälardalen Real-Time Research Centre, Mälardalen University , 2016.
Series
MRTC Reports, ISSN 1404-3041
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:mdh:diva-32583
ISRN: MDH-MRTC-310/2016-1-SE
OAI: oai:DiVA.org:mdh-32583
DiVA: diva2:953823
Projects
SYNOPSIS - Safety Analysis for Predictable Software Intensive Systems
Available from: 2016-08-18. Created: 2016-08-18. Last updated: 2016-12-13. Bibliographically approved.
In thesis
1. An ALARP Stop-Test Decision for the Worst-Case Timing Characteristics of Safety-Critical Systems
2016 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Safety-critical systems are those in which failure can lead to loss of people’s lives, or catastrophic damage to the environment. Timeliness is an important requirement in safety-critical systems, which relates to the notion of response time, i.e., the time a system takes to respond to stimuli from the environment. If the response time exceeds a specified time interval, a catastrophe might occur.

 

Stringent timing requirements make testing a necessary and important process, with which not only the correct system functionality but also the system’s timing behaviour has to be verified. However, a key issue for testers is to determine when to stop testing: stopping too early may leave defects in the system, or even lead to a catastrophe if the undiscovered defects are of high severity, while stopping too late wastes time and resources. To date, researchers and practitioners have mainly focused on the design and application of diverse testing strategies, leaving the critical stop-test decision a largely open issue, especially with respect to timeliness.

 

In the first part of this thesis, we propose a novel approach to making a stop-test decision in the context of testing the worst-case timing characteristics of systems. More specifically, we propose a convergence algorithm that informs the tester whether further testing would reveal significant new insight into the timing behaviour of the system and, if not, suggests that testing be stopped. The convergence algorithm examines the response times observed during testing: it first checks whether the Maximum Observed Response Time (MORT) has recently increased, and when this is no longer the case, it investigates whether the distribution of response times has changed significantly. When no significant new information about the system is revealed during a given period of time, it is concluded, with some statistical confidence, that more testing of the same nature is not going to be useful. However, other testing techniques may still achieve significant new findings.
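The two-stage check described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the function name, window size, KL threshold, and bin count are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import entropy

def has_converged(response_times, window=1000, threshold=0.01, bins=20):
    """Illustrative two-stage convergence check: stop testing only when
    (1) the MORT has not increased within the last `window` samples, and
    (2) the response-time distribution of the last window is close (small
        KL divergence) to that of the window immediately before it."""
    rt = np.asarray(response_times, dtype=float)
    if len(rt) < 2 * window:
        return False  # not enough data to compare two windows

    # Stage 1: has the Maximum Observed Response Time increased recently?
    if rt[-window:].max() > rt[:-window].max():
        return False  # MORT is still growing; keep testing

    # Stage 2: has the distribution of response times changed?
    edges = np.histogram_bin_edges(rt, bins=bins)
    p, _ = np.histogram(rt[-2 * window:-window], bins=edges)
    q, _ = np.histogram(rt[-window:], bins=edges)
    eps = 1e-12                       # avoid log(0) on empty bins
    kl = entropy(p + eps, q + eps)    # entropy() normalises the histograms
    return kl < threshold

# A perfectly stable, repeating workload passes both checks.
stable = np.tile(np.arange(100.0), 50)   # 5000 samples, unchanging pattern
print(has_converged(stable, window=1000))  # True: no new MORT, same shape
```

Conversely, a trace whose maximum keeps climbing (e.g. `np.arange(5000.0)`) fails stage 1, so the sketch keeps testing, which matches the intent described above.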

 

Furthermore, the convergence algorithm is evaluated based on the As Low As Reasonably Practicable (ALARP) principle, an underpinning concept in most safety standards. ALARP involves weighing the benefit against the associated cost. In order to evaluate the convergence algorithm, it is shown that the sacrifice, here testing time, would be grossly disproportionate to the benefit attained, which in this context is any further significant increase in the MORT after stopping the test.

 

Our algorithm includes a set of tunable parameters. The second part of this work improves the algorithm’s performance and scalability through the following steps: first, it is determined whether the parameters affect the algorithm at all; second, the most influential parameters are identified and tuned. This process is based on the Design of Experiments (DoE) approach.

 

Moreover, the algorithm is required to be robust, which in this context is defined as: “the algorithm provides valid stop-test decisions across a required range of task sets”. For example, if the number of tasks in the system varies from 10 to 50 and the tasks’ periods change from the range [200 µs, 400 µs] to the range [200 µs, 1000 µs], the algorithm’s performance should not be adversely affected. In order to achieve robustness, first, the task-set parameters that most influence the algorithm’s performance are identified using the Analysis of Variance (ANOVA) approach. Second, it is examined whether the algorithm is sound over some required ranges of those parameters, and if not, the situations in which the algorithm’s performance significantly degrades are identified. These situations will be used in our future work to stress-test the algorithm and to tune it so that it becomes robust across the required ranges.
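The ANOVA step mentioned above can be illustrated with a short sketch. The data here are synthetic and the factor levels hypothetical (number of tasks at 10, 30, 50); this merely shows how a one-way ANOVA flags a task-set parameter as influential on a performance metric, not the thesis's actual experiment.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical performance metric (e.g., testing time until the stop-test
# decision) measured for task sets at three levels of one factor.
rng = np.random.default_rng(42)
perf_10_tasks = rng.normal(100.0, 5.0, size=30)
perf_30_tasks = rng.normal(120.0, 5.0, size=30)  # each level shifts the mean
perf_50_tasks = rng.normal(140.0, 5.0, size=30)

# One-way ANOVA: does the factor level significantly affect the metric?
f_stat, p_value = f_oneway(perf_10_tasks, perf_30_tasks, perf_50_tasks)
print(f_stat, p_value)  # a small p-value flags the factor as influential
```

In a full DoE/ANOVA study, this comparison would be repeated per task-set parameter (and their interactions) to rank which ones most affect the algorithm.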

 

Finally, the convergence algorithm was shown to be successful when applied to task sets having similar characteristics. However, we observe some experiments in which the algorithm could not suggest a proper stop-test decision in compliance with the ALARP principle, e.g., it stops sooner than expected. Therefore, we examine whether the algorithm itself can be further improved, focusing on the statistical test it uses and on whether another test would perform better.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2016
Series
Mälardalen University Press Licentiate Theses, ISSN 1651-9256 ; 238
National Category
Computer and Information Science
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-32588 (URN)
978-91-7485-279-0 (ISBN)
Presentation
2016-09-19, Gamma, Mälardalens högskola, Västerås, 13:00 (English)
Opponent
Supervisors
Available from: 2016-08-19. Created: 2016-08-18. Last updated: 2016-12-27. Bibliographically approved.

Open Access in DiVA

No full text

Other links

http://www.es.mdh.se/pdf_publications/4443.pdf

Search in DiVA

By author/editor
Malekzadeh, Mahnaz; Bate, Iain
By organisation
Embedded Systems
Computer Systems
