Publications (10 of 185)
Loni, M., Sinaei, S., Zoljodi, A., Daneshtalab, M. & Sjödin, M. (2020). DeepMaker: A multi-objective optimization framework for deep neural networks in embedded systems. Microprocessors and microsystems, 73, Article ID 102989.
DeepMaker: A multi-objective optimization framework for deep neural networks in embedded systems
2020 (English). In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 73, article id 102989. Article in journal (Refereed). Published
Abstract [en]

Deep Neural Networks (DNNs) are compute-intensive learning models with growing applicability in a wide range of domains. Due to their computational complexity, DNNs benefit from implementations that utilize custom hardware accelerators to meet performance and response-time as well as classification-accuracy constraints. In this paper, we propose the DeepMaker framework, which aims to automatically design a set of highly robust DNN architectures for embedded devices, the processing units closest to the sensors. DeepMaker explores and prunes the design space to find improved neural architectures. Our proposed framework takes advantage of a multi-objective evolutionary approach that exploits a pruned design space inspired by a dense architecture. DeepMaker considers accuracy along with network size as two objectives to build a highly optimized network that fits within limited computational resource budgets while delivering an acceptable accuracy level. In comparison with the best result on the CIFAR-10 dataset, a network generated by DeepMaker achieves up to a 26.4x compression rate while losing only 4% accuracy. In addition, DeepMaker maps the generated CNN onto programmable commodity devices, including an ARM processor, a high-performance CPU, a GPU, and an FPGA.
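The abstract describes DeepMaker only at a high level; as a rough illustration of the two-objective trade-off it optimizes (accuracy versus network size), the hypothetical Python sketch below ranks candidate architectures by Pareto dominance. All candidate names and numbers are invented for illustration and are not taken from the paper.

```python
# Illustrative sketch (not from the paper): Pareto ranking of candidate
# CNN architectures by two objectives, accuracy (maximize) and
# parameter count (minimize), as used in multi-objective NAS.

from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float      # validation accuracy in [0, 1]
    parameters: int      # number of trainable parameters


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in both objectives and better in at least one."""
    no_worse = a.accuracy >= b.accuracy and a.parameters <= b.parameters
    better = a.accuracy > b.accuracy or a.parameters < b.parameters
    return no_worse and better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


if __name__ == "__main__":
    # Hypothetical candidates explored by an evolutionary search.
    pool = [
        Candidate("dense-baseline", accuracy=0.93, parameters=25_000_000),
        Candidate("pruned-a",       accuracy=0.91, parameters=1_200_000),
        Candidate("pruned-b",       accuracy=0.89, parameters=950_000),
        Candidate("tiny-c",         accuracy=0.80, parameters=1_500_000),
    ]
    for c in pareto_front(pool):
        print(f"{c.name}: acc={c.accuracy:.2f}, params={c.parameters:,}")
```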

Place, publisher, year, edition, pages
Elsevier B.V., 2020
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-46792 (URN), 10.1016/j.micpro.2020.102989 (DOI), 000520940000032, 2-s2.0-85077516447 (Scopus ID)
Available from: 2020-01-23 Created: 2020-01-23 Last updated: 2020-04-09. Bibliographically approved
Bucaioni, A., Lundbäck, J., Nolin, M. & Mubeen, S. (2020). On Model-based Development of Embedded Software for Evolving Automotive E/E Architectures. In: 17th International Conference on Information Technology: New Generations ITNG'20. Paper presented at the 17th International Conference on Information Technology: New Generations ITNG'20, 01 Mar 2020, Las Vegas, United States. Las Vegas, United States.
On Model-based Development of Embedded Software for Evolving Automotive E/E Architectures
2020 (English). In: 17th International Conference on Information Technology: New Generations ITNG'20, Las Vegas, United States, 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Fueled by an increasing demand for computational power and high-data-rate, low-latency on-board communication, automotive electrical and electronic (E/E) architectures are evolving from distributed to consolidated domain and centralised architectures. Future automotive E/E architectures are envisioned to leverage heterogeneous computing platforms, where several different processing units will be embedded within electronic control units. These powerful control units are expected to be connected by high-bandwidth, low-latency on-board backbone networks. This paper draws on our industrial collaboration with the Swedish automotive industry to tackle the challenges associated with the model-based development of predictable embedded software for contemporary and evolving automotive E/E architectures.

Place, publisher, year, edition, pages
Las Vegas, United States, 2020
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-47341 (URN)
Conference
17th International Conference on Information Technology : New Generations ITNG'20, 01 Mar 2020, Las Vegas, United States
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
HERO: Heterogeneous systems - software-hardware integration
DESTINE: Developing Predictable Vehicle Software Utilizing Time Sensitive Networking
Automation in High-performance Cyber Physical Systems Development
PANORAMA - Boosting Design Efficiency for Heterogeneous³ Systems
Available from: 2020-04-24 Created: 2020-04-24 Last updated: 2020-04-24
Tsog, N., Sjödin, M. & Bruhn, F. (2019). A Trade-Off between Computing Power and Energy Consumption of On-Board Data Processing in GPU Accelerated Real-Time Systems. Paper presented at The 32nd International Symposium on Space Technology and Science, Fukui, Japan.
A Trade-Off between Computing Power and Energy Consumption of On-Board Data Processing in GPU Accelerated Real-Time Systems
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

On-board data processing is one of the key on-orbit activities, as it improves the performance capability of in-orbit space systems such as deep-space exploration, earth and atmospheric observation satellites, and CubeSat constellations. However, on-board data processing incurs higher energy consumption than in traditional space systems, because traditional systems employ simple processing units, such as microcontrollers or a single-core processor, since they require no heavy data processing on orbit. Moreover, solving the radiation-hardness problem is crucial in space, and adopting a new processing unit is challenging.

In this paper, we consider a GPU-accelerated real-time system for on-board data processing. According to prior work, radiation-tolerant GPUs exist, and the computing capability of such systems is improved by using a heterogeneous computing approach. We conduct experimental observations of power consumption and computing potential using this heterogeneous computing approach in our GPU-accelerated real-time system. The results show that the proper use of the GPU increases computing potential by 10-140 times and consumes 8-130 times less energy. Furthermore, the entire task system consumes 10-65% less energy compared to the traditional use of processing units.
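As a back-of-the-envelope illustration of why a GPU that draws more power can still save energy (energy = power × execution time), the following Python sketch uses purely hypothetical numbers, not measurements from the paper.

```python
# Illustrative energy arithmetic (hypothetical numbers, not measurements
# from the paper): energy = power x execution time, so an accelerator that
# draws more power can still consume less energy if it finishes fast enough.

def energy_joules(power_watts: float, time_seconds: float) -> float:
    return power_watts * time_seconds

# Hypothetical workload: one on-board image-processing job.
cpu_power, cpu_time = 5.0, 120.0    # W, s  (sequential CPU execution)
gpu_power, gpu_time = 15.0, 2.0     # W, s  (GPU-accelerated execution)

cpu_energy = energy_joules(cpu_power, cpu_time)   # 600 J
gpu_energy = energy_joules(gpu_power, gpu_time)   #  30 J

print(f"CPU: {cpu_energy:.0f} J, GPU: {gpu_energy:.0f} J")
print(f"Speedup: {cpu_time / gpu_time:.0f}x, "
      f"energy saving: {cpu_energy / gpu_energy:.0f}x")
```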

Keywords
Trade-off, Computing power, Energy consumption, on-board data processing, GPU acceleration, Real-time systems
National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45938 (URN)
Conference
The 32nd International Symposium on Space Technology and Science, Fukui, Japan
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-22. Bibliographically approved
Loni, M., Hamouachy, F., Casarrubios, C., Daneshtalab, M. & Nolin, M. (2019). AutoRIO: An Indoor Testbed for Developing Autonomous Vehicles. In: International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC. Paper presented at the International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 16 Dec 2018, Alexandria, Egypt (pp. 69-72).
AutoRIO: An Indoor Testbed for Developing Autonomous Vehicles
2019 (English). In: International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 2019, p. 69-72. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous vehicles have a great influence on our lives. These vehicles are more convenient and more energy efficient, and provide a higher safety level and cheaper driving solutions. In addition, reduced CO2 emissions and a lower risk of vehicular accidents are other benefits of autonomous vehicles. However, realizing a fully autonomous system is challenging, and the proposed solutions are still new. Providing a testbed for evaluating new algorithms is beneficial for researchers and hardware developers to verify the real impact of their solutions. Such a testing environment is a low-cost infrastructure that helps shorten the time-to-market of novel ideas. In this paper, we propose AutoRIO, a cutting-edge indoor testbed for developing autonomous vehicles.

National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:mdh:diva-42236 (URN), 10.1109/JEC-ECC.2018.8679543 (DOI), 000465120800017, 2-s2.0-85064611063 (Scopus ID), 9781538692301 (ISBN)
Conference
International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 16 Dec 2018, Alexandria, Egypt
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2018-12-28 Created: 2018-12-28 Last updated: 2019-05-09. Bibliographically approved
Ghaderi, A., Daneshtalab, M., Ashjaei, S. M., Loni, M., Mubeen, S. & Sjödin, M. (2019). Design challenges in hardware development of time-sensitive networking: A research plan. In: CEUR Workshop Proceedings, Volume 2457. Paper presented at the 2019 Cyber-Physical Systems PhD Workshop, CPSWS 2019; Alghero; Italy; 23 September 2019. CEUR-WS, 2457.
Design challenges in hardware development of time-sensitive networking: A research plan
2019 (English). In: CEUR Workshop Proceedings, Volume 2457, CEUR-WS, 2019, Vol. 2457. Conference paper, Published paper (Refereed)
Abstract [en]

Time-Sensitive Networking (TSN) is a set of ongoing projects within IEEE standardization to guarantee timeliness and low-latency communication based on switched Ethernet for industrial applications. The huge demand comes mainly from industries where intensive data transmission is required, such as modern vehicles where cameras, lidars and other high-bandwidth sensors are connected. The TSN standards are evolving over time, hence the hardware needs to change along with the modifications. In addition, high-performance hardware is required to obtain the full benefit of the standards. In this paper, we present a research plan for developing novel techniques to support a parameterized and modular hardware IP core of the multi-stage TSN switch fabric in the VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL), which can be deployed on any Field-Programmable Gate Array (FPGA) device. We also present the challenges on the way towards this goal.

Place, publisher, year, edition, pages
CEUR-WS, 2019
Series
CEUR Workshop Proceedings, ISSN 1613-0073 ; 2457
Keywords
FPGA, Memory management, Predictability, Time-sensitive network, Cyber Physical System, Embedded systems, Field programmable gate arrays (FPGA), Integrated circuit design, Vehicle transmissions, Design challenges, Hardware development, High-performance hardware, Low-latency communication, Switched ethernet, Very high speed integrated circuits, Computer hardware description languages
National Category
Computer Engineering; Embedded Systems
Identifiers
urn:nbn:se:mdh:diva-45837 (URN), 2-s2.0-85073187187 (Scopus ID)
Conference
2019 Cyber-Physical Systems PhD Workshop, CPSWS 2019; Alghero; Italy; 23 September 2019
Available from: 2019-10-25 Created: 2019-10-25 Last updated: 2019-12-18. Bibliographically approved
Mubeen, S., Ashjaei, S. M. & Nolin, M. (2019). Holistic modeling of time sensitive networking in component-based vehicular embedded systems. In: Euromicro Conference on Software Engineering and Advanced Applications SEAA 2019. Paper presented at the 45th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2019; Kallithea, Chalkidiki; Greece; 28 August 2019 through 30 August 2019 (pp. 131-139). Institute of Electrical and Electronics Engineers Inc., Article ID 8906692.
Holistic modeling of time sensitive networking in component-based vehicular embedded systems
2019 (English). In: Euromicro Conference on Software Engineering and Advanced Applications SEAA 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 131-139, article id 8906692. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents the first holistic modeling approach for Time-Sensitive Networking (TSN) communication that integrates into a model- and component-based software development framework for distributed embedded systems. Based on these new models, we also present an end-to-end timing model for TSN-interconnected distributed embedded systems. Our approach is expressive enough to model the timing information of TSN and the timing behaviour of software that communicates over TSN, hence allowing end-to-end timing analysis. A proof of concept for the proposed approach is provided by implementing it for a component model and tool suite used in the vehicle industry. Moreover, a use case from the vehicle industry is modeled and analyzed with the proposed approach to demonstrate its usability.
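The paper itself defines the end-to-end timing model; as a loose, simplified illustration of the general idea of compositional end-to-end analysis over a TSN-interconnected chain (summing task response times and network latencies), the following hypothetical Python sketch is offered. The chain, the deadline, and all numbers are assumptions for illustration, not the paper's model.

```python
# Minimal sketch (assumptions only, not the paper's actual timing model):
# an end-to-end delay bound for a distributed task chain obtained by
# summing worst-case response times of tasks and TSN message latencies.

from typing import NamedTuple


class ChainElement(NamedTuple):
    name: str
    wcrt_ms: float   # worst-case response time (task) or latency (message)


def end_to_end_bound(chain: list[ChainElement]) -> float:
    """Compositional upper bound on the end-to-end delay of the chain."""
    return sum(element.wcrt_ms for element in chain)


# Hypothetical camera-to-brake chain over a TSN backbone.
chain = [
    ChainElement("camera_task",      wcrt_ms=4.0),
    ChainElement("tsn_frame_cam",    wcrt_ms=0.5),
    ChainElement("fusion_task",      wcrt_ms=6.0),
    ChainElement("tsn_frame_fusion", wcrt_ms=0.5),
    ChainElement("brake_task",       wcrt_ms=2.0),
]

bound = end_to_end_bound(chain)
deadline_ms = 20.0
print(f"End-to-end bound: {bound} ms (deadline {deadline_ms} ms) "
      f"-> {'OK' if bound <= deadline_ms else 'MISS'}")
```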

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019
National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:mdh:diva-43938 (URN), 10.1109/SEAA.2019.00029 (DOI), 2-s2.0-85075976368 (Scopus ID)
Conference
45th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2019; Kallithea, Chalkidiki; Greece; 28 August 2019 through 30 August 2019
Projects
DESTINE: Developing Predictable Vehicle Software Utilizing Time Sensitive Networking
Available from: 2019-06-20 Created: 2019-06-20 Last updated: 2020-02-04. Bibliographically approved
Loni, M., Zoljodi, A., Seenan, S., Daneshtalab, M. & Nolin, M. (2019). NeuroPower: Designing Energy Efficient Convolutional Neural Network Architecture for Embedded Systems. In: Lecture Notes in Computer Science, Volume 11727. Paper presented at The 28th International Conference on Artificial Neural Networks ICANN 2019, 17 Sep 2019, Munich, Germany (pp. 208-222). Munich, Germany: Springer.
NeuroPower: Designing Energy Efficient Convolutional Neural Network Architecture for Embedded Systems
2019 (English). In: Lecture Notes in Computer Science, Volume 11727, Munich, Germany: Springer, 2019, p. 208-222. Conference paper, Published paper (Refereed)
Abstract [en]

Convolutional Neural Networks (CNNs) suffer from energy-hungry implementations due to their computation- and memory-intensive processing patterns. This problem is made even more significant by the proliferation of CNNs on embedded platforms. To overcome this problem, we offer NeuroPower, an automatic framework that designs a highly optimized and energy-efficient set of CNN architectures for embedded systems. NeuroPower explores and prunes the design space to find an improved set of neural architectures. Toward this aim, a multi-objective optimization strategy is integrated to solve the Neural Architecture Search (NAS) problem by near-optimally tuning network hyperparameters. The main objectives of the optimization algorithm are network accuracy and the number of parameters in the network. The evaluation results show the effectiveness of NeuroPower in terms of energy consumption, compression rate and inference time compared to other cutting-edge approaches. In comparison with the best results on the CIFAR-10/CIFAR-100 datasets, a network generated by NeuroPower achieves up to a 2.1x/1.56x compression rate, 1.59x/3.46x speedup and 1.52x/1.82x power saving, while losing only 2.4%/-0.6% accuracy, respectively.
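To illustrate how headline figures of this kind (compression rate, speedup, power saving) are typically derived from baseline-versus-generated measurements, the following Python sketch uses invented numbers; it does not reproduce the paper's data or methodology.

```python
# Illustrative computation (hypothetical numbers) of the kind of metrics
# quoted above: compression rate, speedup, and power saving of a generated
# network relative to a baseline CNN.

def ratio(baseline: float, optimized: float) -> float:
    return baseline / optimized

baseline  = {"params": 10_000_000, "latency_ms": 40.0, "power_w": 6.0, "acc": 0.93}
generated = {"params":  5_000_000, "latency_ms": 25.0, "power_w": 4.0, "acc": 0.91}

print(f"Compression rate: {ratio(baseline['params'], generated['params']):.2f}x")
print(f"Speedup:          {ratio(baseline['latency_ms'], generated['latency_ms']):.2f}x")
print(f"Power saving:     {ratio(baseline['power_w'], generated['power_w']):.2f}x")
print(f"Accuracy drop:    {(baseline['acc'] - generated['acc']) * 100:.1f} pp")
```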

Place, publisher, year, edition, pages
Munich, Germany: Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 11727
Keywords
Convolutional neural networks (CNNs), Neural Architecture Search (NAS), Embedded Systems, Multi-Objective Optimization
National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45043 (URN), 10.1007/978-3-030-30487-4_17 (DOI), 2-s2.0-85072863572 (Scopus ID), 9783030304867 (ISBN)
Conference
The 28th International Conference on Artificial Neural Networks ICANN 2019, 17 Sep 2019, Munich, Germany
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
Available from: 2019-08-23 Created: 2019-08-23 Last updated: 2019-10-17. Bibliographically approved
Danielsson, J., Marcus, J., Seceleanu, T., Behnam, M. & Sjödin, M. (2019). Run-time Cache-Partition Controller for Multi-core Systems. Paper presented at the 45th Annual Conference of the IEEE Industrial Electronics Society (IECON), 2019.
Run-time Cache-Partition Controller for Multi-core Systems
2019 (English). Conference paper, Published paper (Refereed)
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45949 (URN)
Conference
45th Annual Conference of the IEEE Industrial Electronics Society (IECON), 2019
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-11. Bibliographically approved
Tsog, N., Becker, M., Bruhn, F., Behnam, M. & Nolin, M. (2019). Static Allocation of Parallel Tasks to Improve Schedulability in CPU-GPU Heterogeneous Real-Time Systems. Paper presented at the IEEE 45th Annual Conference of the Industrial Electronics Society, IECON2019.
Static Allocation of Parallel Tasks to Improve Schedulability in CPU-GPU Heterogeneous Real-Time Systems
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous driving is one of the main challenges of modern cars. Computer vision and intelligent on-board decision making are crucial in autonomous driving and require heterogeneous processors with high computing capability under low-power-consumption constraints. The progress of parallel computing using heterogeneous processing units is further supported by software frameworks like OpenCL, OpenMP, CUDA, and C++AMP. These frameworks allow the allocation of parallel computation onto different compute resources. This, however, creates the difficulty of allocating the right computation segments to the right processing units in such a way that the complete system meets all its timing requirements. In this paper, we consider pre-runtime static allocations of parallel tasks that execute either sequentially on a CPU or in parallel using a GPU. This allows for correcting any unbalanced use of GPU accelerators in a heterogeneous environment. Using several heuristic algorithms, we show that the overuse of accelerators results in a bottleneck for the entire system execution. The experimental results show that our allocation schemes, which target a balanced use of the GPU, improve the system schedulability by up to 90%.
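The paper evaluates several heuristics; the following hypothetical Python sketch shows one simple greedy scheme in the same spirit, assigning each task either its sequential CPU variant or its GPU-accelerated variant while balancing utilization so that the accelerator does not become a bottleneck. Task names, WCET values, and the heuristic itself are illustrative assumptions, not the paper's algorithm.

```python
# Hypothetical sketch (not the paper's actual heuristic): statically assign
# each task either its sequential CPU version or its parallel GPU version,
# greedily keeping CPU and GPU utilization balanced.

from typing import NamedTuple


class Task(NamedTuple):
    name: str
    period_ms: float
    cpu_wcet_ms: float   # WCET if executed sequentially on the CPU
    gpu_wcet_ms: float   # WCET if its parallel segments run on the GPU


def allocate(tasks: list[Task]) -> dict[str, str]:
    cpu_util = gpu_util = 0.0
    assignment: dict[str, str] = {}
    # Consider the "heaviest" tasks (highest CPU utilization) first.
    for t in sorted(tasks, key=lambda t: t.cpu_wcet_ms / t.period_ms, reverse=True):
        u_cpu = t.cpu_wcet_ms / t.period_ms
        u_gpu = t.gpu_wcet_ms / t.period_ms
        # Pick the resource that stays least loaded after the assignment.
        if cpu_util + u_cpu <= gpu_util + u_gpu:
            assignment[t.name] = "CPU"
            cpu_util += u_cpu
        else:
            assignment[t.name] = "GPU"
            gpu_util += u_gpu
    print(f"CPU util: {cpu_util:.2f}, GPU util: {gpu_util:.2f}")
    return assignment


tasks = [
    Task("lane_detection",  period_ms=50,  cpu_wcet_ms=30, gpu_wcet_ms=6),
    Task("object_tracking", period_ms=40,  cpu_wcet_ms=24, gpu_wcet_ms=8),
    Task("sensor_fusion",   period_ms=20,  cpu_wcet_ms=5,  gpu_wcet_ms=4),
    Task("logging",         period_ms=100, cpu_wcet_ms=3,  gpu_wcet_ms=3),
]
print(allocate(tasks))
```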

Keywords
Parallel task, Parallel segment, Alternative execution, CPU-GPU, Heterogeneous processors, Real-time systems
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45934 (URN)
Conference
IEEE 45th Annual Conference of the Industrial Electronics Society, IECON2019
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-12-13. Bibliographically approved
Danielsson, J., Seceleanu, T., Marcus, J., Behnam, M. & Sjödin, M. (2019). Testing Performance-Isolation in Multi-Core Systems. Paper presented at the 43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019 (pp. 604-609), Article ID 8754208.
Testing Performance-Isolation in Multi-Core Systems
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present a methodology for quantifying the level of performance isolation in a multi-core system. We have devised a test that can be applied to detect breaches of isolation in the different computing resources that may be shared between cores. We use this test to determine the level of isolation gained by using the Jailhouse hypervisor compared to a regular Linux system in terms of CPU isolation, cache isolation and memory-bus isolation. Our measurements show that the Jailhouse hypervisor provides performance isolation of local computing resources such as the CPU. We have also evaluated whether any isolation could be gained for shared computing resources such as the system-wide cache and the memory-bus controller. Our tests show no measurable difference in partitioning between a regular Linux system and a Jailhouse-partitioned system for shared resources. Using the Jailhouse hypervisor introduces only a small noticeable overhead when executing multiple shared-resource-intensive tasks on multiple cores, which implies that running Jailhouse in a memory-saturated system will not be harmful. However, contention still exists in the memory bus and in the system-wide cache.
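As a rough, simplified illustration of this style of isolation test (time a memory-intensive benchmark alone, then again under an interfering co-runner, and report the slowdown), the following Python sketch is offered. The real experiments use dedicated cores, hardware counters, and Jailhouse partitions, which this plain user-space sketch does not reproduce; all sizes and timings are illustrative.

```python
# Rough sketch (not the paper's exact methodology) of a performance-isolation
# test: time a memory-intensive benchmark alone and again while an interfering
# workload touches a large buffer from another process; the slowdown factor
# indicates how well the shared cache/memory bus is isolated.

import multiprocessing as mp
import time

ARRAY_SIZE = 4_000_000  # elements; large enough to spill out of private caches


def memory_benchmark() -> float:
    """Stride through a large list and return the elapsed wall-clock time."""
    data = list(range(ARRAY_SIZE))
    start = time.perf_counter()
    total = 0
    for i in range(0, ARRAY_SIZE, 16):   # strided accesses stress the memory system
        total += data[i]
    return time.perf_counter() - start


def interference(stop) -> None:
    """Continuously touch a large buffer to create cache/bus contention."""
    noise = list(range(ARRAY_SIZE))
    while not stop.is_set():
        for i in range(0, ARRAY_SIZE, 16):
            noise[i] += 1


if __name__ == "__main__":
    baseline = memory_benchmark()                 # run alone

    stop = mp.Event()
    p = mp.Process(target=interference, args=(stop,))
    p.start()
    time.sleep(0.2)                               # let the interferer warm up
    contended = memory_benchmark()                # run under contention
    stop.set()
    p.join()

    print(f"baseline:  {baseline:.3f} s")
    print(f"contended: {contended:.3f} s")
    print(f"slowdown:  {contended / baseline:.2f}x (1.0x = perfect isolation)")
```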

National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45947 (URN), 10.1109/COMPSAC.2019.00092 (DOI), 2-s2.0-85072706762 (Scopus ID), 978-1-7281-2607-4 (ISBN)
Conference
43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-12-17. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-7586-0409
