Abstract
Testing is an important part of any software development project and typically accounts for more than half of the development cost. For safety-critical computer-based systems, testing is even more important due to stringent reliability and safety requirements. However, most safety-critical computer-based systems are real-time systems, and the majority of current testing and debugging techniques have been developed for sequential (non-real-time) programs. These techniques are not directly applicable to real-time systems, since they disregard issues of timing and concurrency. This means that existing techniques for reproducible testing and debugging cannot be used. Reproducibility is essential for regression testing and cyclic debugging, where the same test cases are run repeatedly to verify modified program code or to track down errors. The current trend in consumer and industrial applications is a move from single micro-controllers to sets of distributed micro-controllers, which are even more challenging to handle than real-time systems per se, since multiple loci of observation and control must additionally be considered.
In this thesis we try to remedy these problems by presenting an integrated approach to monitoring, testing, and debugging of distributed real-time systems.
For monitoring, we present a method for deterministic observations of single-tasking, multi-tasking, and distributed real-time systems. This includes a description of what to observe, how to eliminate the disturbances caused by the act of observing, how to correlate observations, and how to reproduce them.
For debugging, we present a software-based method that uses deterministic replay to achieve reproducible debugging of single-tasking, multi-tasking, and distributed real-time systems. Program executions are deterministically reproduced off-line, using information about interrupts, task switches, timing, data accesses, etc., recorded at run-time.
For testing, we introduce a method for deterministic testing of multi-tasking and distributed real-time systems. Given a set of tasks and a schedule, this method derives all execution orderings that can occur at run-time. Each such ordering is regarded as a sequential program, and by identifying which ordering is actually executed during testing, techniques for testing sequential software can be applied.
For system development, we show the benefits of considering monitoring, debugging, and testing early in the design of real-time system software, and we give examples illustrating how to monitor, test, and debug distributed real-time systems.