Consolidating Automotive Real-Time Applications on Many-Core Platforms
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. (Complex Real-Time Embedded Systems) ORCID iD: 0000-0002-1276-3609
2017 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Automotive systems have transitioned from basic transportation utilities to sophisticated systems. The rapid increase in functionality comes with a steep increase in software complexity. This manifests itself in a surge in the number of functionalities as well as in the complexity of existing functions. To cope with this transition, current trends shift away from today’s distributed architectures towards integrated architectures, where previously distributed functionality is consolidated on fewer, more powerful computers. This can ease the integration process, reduce the hardware complexity, and ultimately save costs.

One promising hardware platform for these powerful embedded computers is the many-core processor. A many-core processor hosts a vast number of compute cores that are partitioned into tiles, which are connected by a Network-on-Chip. These natural partitions can provide exclusive execution spaces for different applications, since most resources are not shared among them. Hence, natural building blocks towards temporally and spatially separated execution spaces exist as a result of the hardware architecture.

In addition to the traditional task-local deadlines, automotive applications are often subject to timing constraints on the data propagation through a chain of semantically related tasks. Such requirements pose challenges to system designers, as they are only able to verify them after the system synthesis (i.e. very late in the design process).

In this thesis, we present methods that transform complex timing constraints on the data propagation delay into precedence constraints between individual jobs. An execution framework for the clusters of the many-core processor is proposed that allows access to cluster-external memory while avoiding contention on shared resources by design. A partitioning and configuration of the Network-on-Chip provides isolation between the different applications and reduces the access time from the clusters to external memory. Moreover, methods that facilitate the verification of data propagation delays in each development step are provided.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2017.
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 246
Keyword [en]
Many-Core, Automotive, Network-on-Chip, Real-Time, Timing analysis
National Category
Embedded Systems
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-37182; ISBN: 978-91-7485-359-9 (print); OAI: oai:DiVA.org:mdh-37182; DiVA: diva2:1154504
Public defence
2017-12-19, Kappa, Mälardalens högskola, Västerås, 09:00 (English)
Opponent
Supervisors
Available from: 2017-11-06 Created: 2017-11-02 Last updated: 2017-11-06 Bibliographically approved
List of papers
1. Investigation on AUTOSAR-Compliant solutions for many-core architectures
2015 (English). In: Proceedings - 18th Euromicro Conference on Digital System Design, DSD 2015, 2015, 95-103 p. Conference paper, Published paper (Refereed)
Abstract [en]

As of today, AUTOSAR is the de facto standard in the automotive industry, providing a common software architecture and development process for automotive applications. While this standard was originally written for single-core Electronic Control Units (ECUs), new guidelines and recommendations have been added recently to provide support for multicore architectures. This update came as a response to the steady increase in the number and complexity of the software functions embedded in modern vehicles, which call for the computing power of multicore execution environments. In this paper, we enumerate and analyze the design options and the challenges of porting AUTOSAR-based automotive applications onto multicore platforms. In particular, we investigate those options when considering the emerging many-core architectures that provide a more 'scalable' environment than the traditional multicore systems. Such platforms enable massively parallel execution, and their design is better suited to partitioning and isolating the software components.

Keyword
Automotive, AUTOSAR, E/E architecture, Many-core, Multicore, Application programs, Automobiles, Automotive industry, Control systems, Design, Software architecture, Systems analysis, Automotive applications, Electronic control units, Execution environments, Many core, Multi core, Multicore architectures, Computer architecture
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:mdh:diva-31240 (URN); 10.1109/DSD.2015.63 (DOI); 000382382300013 (); 2-s2.0-84958170096 (Scopus ID); 9781467380355 (ISBN)
Conference
18th Euromicro Conference on Digital System Design, DSD 2015, 26 August 2015 through 28 August 2015
Available from: 2016-03-03 Created: 2016-03-03 Last updated: 2017-11-02 Bibliographically approved
2. Synthesizing Job-Level Dependencies for Automotive Multi-Rate Effect Chains
2016 (English). In: The 22nd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications RTCSA'16, 2016, Vol. sept, 159-169 p., 579951. Conference paper, Published paper (Refereed)
Abstract [en]

Today’s automotive embedded systems comprise a multitude of functionalities, many with complex timing requirements. Besides task-specific timing requirements, such applications often have timing requirements for the propagation of data through a chain of tasks. An important metric for control applications is the data age, which is addressed in this work. The analysis of such systems is non-trivial because tasks involved in the data propagation may execute at different periods, which leads to over- and undersampling within one chain. This work presents a novel method to compute worst- and best-case end-to-end latencies for such systems. A second contribution synthesizes job-level dependencies for such task sets in a way that data paths which exceed the age constraint are eliminated. An extensive evaluation is performed on synthetic task sets and the applicability to industrial applications is demonstrated in a case study.
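
As a rough illustration of the over- and undersampling mentioned in this abstract, the Python sketch below classifies each producer/consumer link of a chain by comparing task periods and computes the hyperperiod after which the release pattern repeats. It is not the latency analysis of the paper. The function names `hyperperiod` and `classify_links` and the example periods are assumptions made purely for illustration.

```python
from math import lcm  # the variadic form of lcm needs Python 3.9 or newer


def hyperperiod(periods):
    """Return the time after which the release pattern of all tasks repeats."""
    return lcm(*periods)


def classify_links(periods):
    """Classify each producer/consumer pair of a chain by comparing periods."""
    kinds = []
    for producer, consumer in zip(periods, periods[1:]):
        if consumer < producer:
            # The consumer runs more often and may read the same value repeatedly.
            kinds.append("oversampling")
        elif consumer > producer:
            # The consumer runs less often, so some values are overwritten unread.
            kinds.append("undersampling")
        else:
            kinds.append("same rate")
    return kinds


chain_periods = [10, 2, 5]  # hypothetical periods in ms, listed in chain order
print(hyperperiod(chain_periods))
print(classify_links(chain_periods))
```

A chain that mixes both link types, as in the hypothetical example, is the case in which the age of a data value can differ from one job of the last task to the next.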

Keyword
End-to-End Latency, Cause-Effect Chain, Automotive, Age Constraint, Data Age
National Category
Engineering and Technology; Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:mdh:diva-32854 (URN); 10.1109/RTCSA.2016.41 (DOI); 000387085600031 (); 2-s2.0-84994493307 (Scopus ID)
Conference
The 22nd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications RTCSA'16, 17 Aug 2016, Daegu, South Korea
Projects
PREMISE - Predictable Multicore Systems; DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2016-09-30 Created: 2016-08-24 Last updated: 2017-11-02 Bibliographically approved
3. A Generic Framework Facilitating Early Analysis of Data Propagation Delays in Multi-Rate Systems
2017 (English). In: The 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications RTCSA'17, 2017, 8046323. Conference paper, Published paper (Refereed)
Abstract [en]

A majority of multi-rate real-time systems are constrained by a multitude of timing requirements, in addition to the traditional deadlines on well-studied response times. This means that the timing predictability of these systems depends not only on the schedulability of certain task sets but also on the timely propagation of data through the chains of tasks from sensors to actuators. In the automotive industry, four different timing constraints corresponding to various data propagation delays are commonly specified on the systems. This paper identifies and addresses the source of pessimism as well as optimism in the calculations for one such delay, namely the reaction delay, in the state-of-the-art analysis that is already implemented in several industrial tools. Furthermore, a generic framework is proposed to compute all four end-to-end data propagation delays, complying with the established delay semantics, in a scheduler- and hardware-agnostic manner. This allows analysis of the system models already at early development phases, where limited system information is present. The paper further introduces mechanisms to generate job-level dependencies, a partial ordering of jobs, which need to be satisfied by any execution platform in order to meet the data propagation timing requirements. The job-level dependencies are first added to all task chains of the system and then reduced to their minimum required set such that the job order is not affected. Moreover, a necessary schedulability test is provided, allowing for varying the number of CPUs. The experimental evaluations demonstrate the tightness in the reaction delay with the proposed framework as compared to the existing state-of-the-art and practice solutions.
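
One way to picture the reduction of job-level dependencies to a minimal set that leaves the job order unchanged is a transitive reduction of the precedence graph. The Python sketch below shows that generic graph technique under the assumption that dependencies form a directed acyclic graph. It is not taken from the paper, and the function name `transitive_reduction` and job names such as `t1_0` are hypothetical.

```python
def transitive_reduction(edges):
    """Drop every precedence edge that is implied by a longer path (DAG only)."""
    successors = {}
    for before, after in edges:
        successors.setdefault(before, set()).add(after)
        successors.setdefault(after, set())

    def reachable_without(src, dst, skipped_edge):
        # Depth-first search from src to dst that ignores skipped_edge.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in successors[node]:
                if (node, nxt) == skipped_edge or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {e for e in edges if not reachable_without(e[0], e[1], e)}


# Hypothetical jobs: t1_0 precedes t2_0, which precedes t3_0.
deps = {("t1_0", "t2_0"), ("t2_0", "t3_0"), ("t1_0", "t3_0")}
print(sorted(transitive_reduction(deps)))
```

In the hypothetical example the direct edge from `t1_0` to `t3_0` is dropped because the order it enforces already follows from the other two dependencies.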

Keyword
Data Propagation Delay, End-to-End Delay, Real-Time, Automotive
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-37034 (URN); 10.1109/RTCSA.2017.8046323 (DOI); 2-s2.0-85032739692 (Scopus ID)
Conference
The 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications RTCSA'17, 16 Aug 2017, Hsinchu, Taiwan
Projects
PREMISE - Predictable Multicore Systems; DPAC - Dependable Platforms for Autonomous systems and Control; PreView: Developing Predictable Vehicle Software on Multi-core
Available from: 2017-11-02 Created: 2017-11-02 Last updated: 2017-11-16 Bibliographically approved
4. End-to-End Timing Analysis of Cause-Effect Chains in Automotive Embedded Systems
2017 (English). In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 80, no Supplement C, 104-113 p. Article in journal (Refereed) Published
Abstract [en]

Automotive embedded systems are subject to stringent timing requirements that need to be verified. One of the most complex timing requirements in these systems is the data age constraint. This constraint is specified on cause-effect chains and restricts the maximum time for the propagation of data through the chain. Tasks in a cause-effect chain can have different activation patterns and different periods, which introduce over- and under-sampling effects that further complicate the end-to-end timing analysis of the chain. Furthermore, the level of timing information available at various development stages (from modeling of the software architecture to the software implementation) varies considerably; the complete timing information is available only at the implementation stage. This uncertainty and limited timing information can restrict the end-to-end timing analysis of these chains. In this paper, we present methods to compute end-to-end delays based on different levels of system information. The characteristics of different communication semantics are further taken into account, thereby enabling timing analysis throughout the development process of such heterogeneous software systems. The presented methods are evaluated with extensive experiments. As a proof of concept, an industrial case study demonstrates the applicability of the proposed methods following a state-of-the-practice development process.

Keyword
Data Propagation Delay, Automotive Real-Time
National Category
Embedded Systems
Identifiers
urn:nbn:se:mdh:diva-37084 (URN); 10.1016/j.sysarc.2017.09.004 (DOI); 000413883100010 (); 2-s2.0-85031742078 (Scopus ID)
Projects
PREMISE - Predictable Multicore Systems; DPAC - Dependable Platforms for Autonomous systems and Control; PreView: Developing Predictable Vehicle Software on Multi-core
Available from: 2017-10-27 Created: 2017-10-27 Last updated: 2017-11-16 Bibliographically approved
5. Contention-Free Execution of Automotive Applications on a Clustered Many-Core Platform
2016 (English). In: 28th Euromicro Conference on Real-Time Systems ECRTS'16, Toulouse, France, 2016, 14-24 p. Conference paper, Published paper (Refereed)
Abstract [en]

Next generations of compute-intensive real-time applications in automotive systems will require more powerful computing platforms. One promising power-efficient solution for such applications is to use clustered many-core architectures. However, ensuring that real-time requirements are satisfied in the presence of contention in shared resources, such as memories, remains an open issue. This work presents a novel contention-free execution framework to execute automotive applications on such platforms. Privatization of memory banks together with defined access phases to shared memory resources is the backbone of the framework. An Integer Linear Programming (ILP) formulation is presented to find the optimal time-triggered schedule for the on-core execution as well as for the access to shared memory. Additionally, a heuristic solution is presented that generates the schedule in a fraction of the time required by the ILP. Extensive evaluations show that the proposed heuristic comes within 0.5% of the optimal solution while outperforming a baseline heuristic by 67%. The applicability of the approach to industrially sized problems is demonstrated in a case study of software for Engine Management Systems.
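
The idea of defined access phases to shared memory can be pictured with a small check that no two phases of a time-triggered schedule overlap. The Python sketch below is only an illustration under that assumption. It is neither the ILP formulation nor the heuristic of the paper, and the function name `is_contention_free`, the core labels, and the time values are hypothetical.

```python
def is_contention_free(phases):
    """Return True if no two shared-memory access phases overlap in time.

    Each phase is a tuple (core, start, end) describing the half-open
    interval [start, end) in which that core accesses shared memory.
    """
    ordered = sorted(phases, key=lambda phase: phase[1])
    # After sorting by start time, any overlap shows up between neighbours.
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            return False
    return True


# Hypothetical schedule: three cores take turns accessing shared memory.
schedule = [("core0", 0, 2), ("core1", 2, 5), ("core2", 5, 6)]
print(is_contention_free(schedule))
```

A schedule that passes this check serializes all accesses to the shared resource, which is the property a contention-free execution framework aims to guarantee by construction.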

Place, publisher, year, edition, pages
Toulouse, France, 2016
Keyword
Many-Core, Execution Framework, Automotive, Clustered Architecture, Time Triggered Scheduling
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-32844 (URN); 10.1109/ECRTS.2016.14 (DOI); 000389463400002 (); 2-s2.0-84989911785 (Scopus ID); 978-1-5090-2811-5 (ISBN)
Conference
28th Euromicro Conference on Real-Time Systems ECRTS'16, 05 Jul 2016, Toulouse, France
Projects
PREMISE - Predictable Multicore Systems
Available from: 2016-09-30 Created: 2016-08-24 Last updated: 2017-11-02 Bibliographically approved
6. Partitioning and Analysis of the Network-on-Chip on a COTS Many-Core Platform
2017 (English). In: 23rd IEEE Real-Time and Embedded Technology and Applications Symposium RTAS'17, 2017, 101-112 p. Conference paper, Published paper (Refereed)
Abstract [en]

Many-core processors can provide the computational power required by future complex embedded systems. However, their adoption is not trivial, since several sources of interference on COTS many-core platforms have adverse effects on the resulting performance. One main source of performance degradation is the contention on the Network-on-Chip (NoC), which is used for communication among the compute cores via the off-chip memory. Available analysis techniques for the traversal time of messages on the NoC do not consider many of the architectural features found on COTS platforms. In this work, we target a state-of-the-art many-core processor, the Kalray MPPA®. A novel partitioning strategy for reducing the contention on the NoC is proposed. Further, we present an analysis technique dedicated to the proposed partitioning strategy, which considers all architectural features of the COTS NoC. Additionally, it is shown how to configure the parameters for flow regulation on the NoC such that the Worst-Case Traversal Time (WCTT) is minimal and buffers never overflow. The benefits of our approach are evaluated in extensive experiments, which show that contention is significantly reduced compared to the unconstrained case, while the proposed analysis outperforms a state-of-the-art analysis for the same platform. An industrial case study shows the tightness of the proposed analysis.

Keyword
Many-Core, Network-on-Chip, Partitioning, Real-Time, Memory Access
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-35460 (URN); 10.1109/RTAS.2017.32 (DOI); 000411195100009 (); 2-s2.0-85021824983 (Scopus ID); 978-1-5090-5269-1 (ISBN)
Conference
23rd IEEE Real-Time and Embedded Technology and Applications Symposium RTAS'17, 18-21 Apr 2017, Pittsburgh PA, United States
Projects
PREMISE - Predictable Multicore Systems; DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2017-06-12 Created: 2017-06-12 Last updated: 2017-11-02 Bibliographically approved
7. Scheduling Multi-Rate Real-Time Applications on Clustered Many-Core Architectures with Memory Constraints
(English). In: 23rd Asia and South Pacific Design Automation Conference ASP-DAC'18. Conference paper, Published paper (Refereed)
Abstract [en]

Access to shared memory is one of the main challenges for many-core processors. One group of scheduling strategies for such platforms focuses on separating tasks’ access to shared memory from their code execution. This makes it possible to orchestrate the access to shared local and off-chip memory such that access contention between different compute cores is avoided by design. In this work, an execution framework is introduced that leverages local memory by statically allocating a subset of tasks to cores. This reduces the access times to shared memory, as off-chip memory access is avoided, and in turn improves the schedulability of such systems. A Constraint Programming (CP) formulation is presented that selects the statically allocated tasks and generates the complete system schedule. Evaluations show that the proposed approach yields up to 21% higher schedulability ratio than related work, and a case study demonstrates its applicability to industrial problems.

Keyword
Many-Core, Contention-Free Execution, Real-Time, Memory Constraints
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-37064 (URN)
Conference
23rd Asia and South Pacific Design Automation Conference ASP-DAC'18, 22 Jan 2018, Jeju Island, South Korea
Projects
PREMISE - Predictable Multicore Systems; DPAC - Dependable Platforms for Autonomous systems and Control; PreView: Developing Predictable Vehicle Software on Multi-core
Available from: 2017-11-02 Created: 2017-11-02 Last updated: 2017-11-02 Bibliographically approved

Open Access in DiVA

No full text

Authority records BETA

Becker, Matthias
