Publications (10 of 182)
Tsog, N., Sjödin, M. & Bruhn, F. (2019). A Trade-Off between Computing Power and Energy Consumption of On-Board Data Processing in GPU Accelerated Real-Time Systems. Paper presented at The 32nd International Symposium on Space Technology and Science, Fukui, Japan.
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

On-board data processing is one of the priority on-orbit activities, as it improves the performance capability of in-orbit space systems such as deep-space exploration spacecraft, earth and atmospheric observation satellites, and CubeSat constellations. However, on-board data processing incurs higher energy consumption than traditional space systems, because traditional space systems employ simple processing units such as micro-controllers or single-core processors and require no heavy data processing on orbit. Moreover, solving the radiation hardness problem is crucial in space, and adopting a new processing unit is challenging.

In this paper, we consider a GPU accelerated real-time system for on-board data processing. According to prior works, radiation-tolerant GPUs exist, and the computing capability of systems is improved by using a heterogeneous computing method. We conduct experimental observations of power consumption and computing potential using this heterogeneous computing method in our GPU accelerated real-time system. The results show that the proper use of the GPU increases computing potential by 10-140 times and consumes 8-130 times less energy. Furthermore, the entire task system consumes 10-65% less energy compared to the traditional use of processing units.

Keywords
Trade-off, Computing power, Energy consumption, on-board data processing, GPU acceleration, Real-time systems
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45938 (URN)
Conference
The 32nd International Symposium on Space Technology and Science, Fukui, Japan
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-22 Bibliographically approved
Loni, M., Hamouachy, F., Casarrubios, C., Daneshtalab, M. & Nolin, M. (2019). AutoRIO: An Indoor Testbed for Developing Autonomous Vehicles. In: International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC. Paper presented at International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 16 Dec 2018, Alexandria, Egypt (pp. 69-72).
2019 (English). In: International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 2019, p. 69-72. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous vehicles have a great influence on our lives. These vehicles are more convenient and more energy efficient, and they provide a higher safety level and cheaper driving solutions. In addition, reducing the generation of CO2 and the risk of vehicular accidents are other benefits of autonomous vehicles. However, deploying a fully autonomous system is challenging, and the proposed solutions are still new. Providing a testbed for evaluating new algorithms helps researchers and hardware developers verify the real impact of their solutions. Such a testing environment is a low-cost infrastructure that helps improve the time-to-market of novel ideas. In this paper, we propose AutoRIO, a cutting-edge indoor testbed for developing autonomous vehicles.

National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-42236 (URN)
10.1109/JEC-ECC.2018.8679543 (DOI)
000465120800017 ()
2-s2.0-85064611063 (Scopus ID)
9781538692301 (ISBN)
Conference
International Japan-Africa Conference on Electronics, Communications and Computations JAC-ECC, 16 Dec 2018, Alexandria, Egypt
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2018-12-28 Created: 2018-12-28 Last updated: 2019-05-09 Bibliographically approved
Ghaderi, A., Daneshtalab, M., Ashjaei, S. M., Loni, M., Mubeen, S. & Sjödin, M. (2019). Design challenges in hardware development of time-sensitive networking: A research plan. In: CEUR Workshop Proceedings. Paper presented at 2019 Cyber-Physical Systems PhD Workshop, CPSWS 2019; Alghero; Italy; 23 September 2019. CEUR-WS, 2457
2019 (English). In: CEUR Workshop Proceedings, CEUR-WS, 2019, Vol. 2457. Conference paper, Published paper (Refereed)
Abstract [en]

Time-Sensitive Networking (TSN) is a set of ongoing projects within IEEE standardization to guarantee timeliness and low-latency communication based on switched Ethernet for industrial applications. The huge demand comes mainly from industries where intensive data transmission is required, such as modern vehicles in which cameras, lidars and high-bandwidth modern sensors are connected. The TSN standards are evolving over time, hence the hardware needs to change as the standards are modified. In addition, high-performance hardware is required to obtain the full benefit of the standards. In this paper, we present a research plan for developing novel techniques to support a parameterized and modular hardware IP core of the multi-stage TSN switch fabric in VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL), which can be deployed on any Field-Programmable Gate Array (FPGA) device. We present the challenges on the way towards this goal.

Place, publisher, year, edition, pages
CEUR-WS, 2019
Keywords
FPGA, Memory management, Predictability, Time-sensitive network, Cyber Physical System, Embedded systems, Field programmable gate arrays (FPGA), Integrated circuit design, Vehicle transmissions, Design challenges, Hardware development, High-performance hardware, Low-latency communication, Switched ethernet, Very high speed integrated circuits, Computer hardware description languages
National Category
Computer Engineering Embedded Systems
Identifiers
urn:nbn:se:mdh:diva-45837 (URN)
2-s2.0-85073187187 (Scopus ID)
Conference
2019 Cyber-Physical Systems PhD Workshop, CPSWS 2019; Alghero; Italy; 23 September 2019
Available from: 2019-10-25 Created: 2019-10-25 Last updated: 2019-10-25
Loni, M., Zoljodi, A., Seenan, S., Daneshtalab, M. & Nolin, M. (2019). NeuroPower: Designing Energy Efficient Convolutional Neural Network Architecture for Embedded Systems. In: Lecture Notes in Computer Science, Volume 11727. Paper presented at The 28th International Conference on Artificial Neural Networks ICANN 2019, 17 Sep 2019, Munich, Germany (pp. 208-222). Munich, Germany: Springer
2019 (English). In: Lecture Notes in Computer Science, Volume 11727, Munich, Germany: Springer, 2019, p. 208-222. Conference paper, Published paper (Refereed)
Abstract [en]

Convolutional Neural Networks (CNNs) suffer from energy-hungry implementations due to their computation- and memory-intensive processing patterns. This problem is made even more significant by the proliferation of CNNs on embedded platforms. To overcome this problem, we offer NeuroPower, an automatic framework that designs a highly optimized and energy-efficient set of CNN architectures for embedded systems. NeuroPower explores and prunes the design space to find an improved set of neural architectures. Toward this aim, a multi-objective optimization strategy is integrated to solve the Neural Architecture Search (NAS) problem by near-optimal tuning of network hyperparameters. The main objectives of the optimization algorithm are network accuracy and the number of parameters in the network. The evaluation results show the effectiveness of NeuroPower on energy consumption, compression rate and inference time compared to other cutting-edge approaches. In comparison with the best results on the CIFAR-10/CIFAR-100 datasets, a network generated by NeuroPower presents up to 2.1x/1.56x compression rate, 1.59x/3.46x speedup and 1.52x/1.82x power saving while losing 2.4%/-0.6% accuracy, respectively.
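
The abstract names two optimization objectives, network accuracy and number of parameters, but does not describe the search procedure itself. As a rough illustration only, the sketch below shows one common way to compare candidates under two objectives: keeping the non-dominated (Pareto-optimal) set. The use of Pareto dominance, the `Candidate` type, and all candidate names and numbers are assumptions made for this example; they are not taken from NeuroPower.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical architecture candidate (illustrative values only)."""
    name: str
    accuracy: float  # objective 1: higher is better
    params: int      # objective 2: lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.params <= b.params
    strictly_better = a.accuracy > b.accuracy or a.params < b.params
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up candidates; net_c is dominated by net_a (lower accuracy, more parameters).
candidates = [
    Candidate("net_a", 0.91, 1_200_000),
    Candidate("net_b", 0.93, 2_500_000),
    Candidate("net_c", 0.90, 1_500_000),
    Candidate("net_d", 0.88, 400_000),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

A multi-objective search would typically return such a set of trade-off points rather than a single winner, which matches the abstract's wording of finding "a set" of architectures.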

Place, publisher, year, edition, pages
Munich, Germany: Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 11727
Keywords
Convolutional neural networks (CNNs), Neural Architecture Search (NAS), Embedded Systems, Multi-Objective Optimization
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45043 (URN)
10.1007/978-3-030-30487-4_17 (DOI)
2-s2.0-85072863572 (Scopus ID)
9783030304867 (ISBN)
Conference
The 28th International Conference on Artificial Neural Networks ICANN 2019, 17 Sep 2019, Munich, Germany
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
Available from: 2019-08-23 Created: 2019-08-23 Last updated: 2019-10-17 Bibliographically approved
Danielsson, J., Marcus, J., Seceleanu, T., Behnam, M. & Sjödin, M. (2019). Run-time Cache-Partition Controller for Multi-core Systems. Paper presented at 45th Annual Conference of the IEEE Industrial Electronics Society (IECON), 2019.
2019 (English). Conference paper, Published paper (Refereed)
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45949 (URN)
Conference
45th Annual Conference of the IEEE Industrial Electronics Society (IECON), 2019
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-11 Bibliographically approved
Tsog, N., Becker, M., Bruhn, F., Behnam, M. & Nolin, M. (2019). Static Allocation of Parallel Tasks to Improve Schedulability in CPU-GPU Heterogeneous Real-Time Systems. Paper presented at IEEE 45th Annual Conference of the Industrial Electronics Society, IECON2019.
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous driving is one of the main challenges of modern cars. Computer vision and intelligent on-board decision making are crucial in autonomous driving and require heterogeneous processors with high computing capability under low power consumption constraints. The progress of parallel computing using heterogeneous processing units is further supported by software frameworks like OpenCL, OpenMP, CUDA, and C++AMP. These frameworks allow the allocation of parallel computation on different compute resources. This, however, creates a difficulty in allocating the right computation segments to the right processing units in such a way that the complete system meets all its timing requirements. In this paper, we consider pre-runtime static allocations of parallel tasks to perform their execution either sequentially on a CPU or in parallel using a GPU. This allows for correcting any unbalanced use of GPU accelerators in a heterogeneous environment. By evaluating several heuristic algorithms, we show that the overuse of accelerators results in a bottleneck for the entire system execution. The experimental results show that our allocation schemes that target a balanced use of the GPU improve the system schedulability by up to 90%.

Keywords
Parallel task, Parallel segment, Alternative execution, CPU-GPU, Heterogeneous processors, Real-time systems
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45934 (URN)
Conference
IEEE 45th Annual Conference of the Industrial Electronics Society, IECON2019
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-11
Danielsson, J., Seceleanu, T., Marcus, J., Behnam, M. & Sjödin, M. (2019). Testing Performance-Isolation in Multi-Core Systems. Paper presented at 43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019 (pp. 604-609), Article ID 8754208.
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present a methodology for quantifying the level of performance isolation in a multi-core system. We have devised a test that can be applied to breaches of isolation in different computing resources that may be shared between different cores. We use this test to determine the level of isolation gained by using the Jailhouse hypervisor compared to a regular Linux system in terms of CPU isolation, cache isolation and memory bus isolation. Our measurements show that the Jailhouse hypervisor provides performance isolation of local computing resources such as the CPU. We have also evaluated whether any isolation could be gained for shared computing resources such as the system-wide cache and the memory bus controller. Our tests show no measurable difference in partitioning between a regular Linux system and a Jailhouse-partitioned system for shared resources. Using the Jailhouse hypervisor adds only a small overhead when executing multiple shared-resource-intensive tasks on multiple cores, which implies that running Jailhouse in a memory-saturated system will not be harmful. However, contention still exists in the memory bus and in the system-wide cache.

National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45947 (URN)
10.1109/COMPSAC.2019.00092 (DOI)
978-1-7281-2607-4 (ISBN)
Conference
43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-11 Bibliographically approved
Nazari, N., Loni, M., E. Salehi, M., Daneshtalab, M. & Sjödin, M. (2019). TOT-Net: An Endeavor Toward Optimizing Ternary Neural Networks. In: 22nd Euromicro Conference on Digital System Design DSD 2019. Paper presented at 22nd Euromicro Conference on Digital System Design DSD 2019, 28 Aug 2019, Chalkidiki, Greece (pp. 305-312), Article ID 8875067.
2019 (English). In: 22nd Euromicro Conference on Digital System Design DSD 2019, 2019, p. 305-312, article id 8875067. Conference paper, Published paper (Refereed)
Abstract [en]

High computation demands and large memory requirements are the major implementation challenges of Convolutional Neural Networks (CNNs), especially for low-power and resource-limited embedded devices. Many binarized neural networks have recently been proposed to address these issues. Although they significantly decrease computation and memory footprint, they suffer from accuracy loss, especially for large datasets. In this paper, we propose TOT-Net, a ternarized neural network with [-1, 0, 1] values for both weights and activation functions that simultaneously achieves a higher level of accuracy and a lower computational load. First, TOT-Net introduces a simple bitwise logic for convolution computations to reduce the cost of multiply operations. To improve the accuracy, selecting a proper activation function and learning rate is influential, but also difficult. As the second contribution, we propose a novel piece-wise activation function and an optimized learning rate for different datasets. Our findings reveal that 0.01 is a preferable learning rate for the studied datasets. Third, by using an evolutionary optimization approach, we found novel piece-wise activation functions customized for TOT-Net. According to the experimental results, TOT-Net achieves 2.15%, 8.77%, and 5.7/5.52% better accuracy compared to XNOR-Net on CIFAR-10, CIFAR-100, and ImageNet top-5/top-1 datasets, respectively.
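
The abstract states that TOT-Net replaces multiply operations with a simple bitwise logic but does not give the exact scheme. As a hedged illustration of the general idea only, the sketch below computes a dot product of two ternary vectors using AND and bit counting instead of multiplication. The two-bitmask encoding and the helper names `pack`, `popcount` and `ternary_dot` are assumptions for this example, not the paper's method.

```python
from typing import Sequence, Tuple

# Assumed encoding: a ternary vector is stored as two integer bitmasks,
# one marking the +1 positions and one marking the -1 positions.
Packed = Tuple[int, int]


def pack(values: Sequence[int]) -> Packed:
    """Pack a sequence of -1/0/+1 values into (positive_mask, negative_mask)."""
    pos = neg = 0
    for i, v in enumerate(values):
        if v == 1:
            pos |= 1 << i
        elif v == -1:
            neg |= 1 << i
        elif v != 0:
            raise ValueError(f"not a ternary value: {v}")
    return pos, neg


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def ternary_dot(a: Packed, b: Packed) -> int:
    """Dot product of two packed ternary vectors without multiplication."""
    a_pos, a_neg = a
    b_pos, b_neg = b
    # Positions with equal nonzero signs contribute +1 each.
    same_sign = popcount(a_pos & b_pos) + popcount(a_neg & b_neg)
    # Positions with opposite nonzero signs contribute -1 each.
    opposite_sign = popcount(a_pos & b_neg) + popcount(a_neg & b_pos)
    return same_sign - opposite_sign


# Illustrative weights and activations (made-up values).
weights = [1, -1, 0, 1, -1, 0, 1, 1]
activations = [1, 1, 0, -1, -1, 1, 0, 1]

result = ternary_dot(pack(weights), pack(activations))
reference = sum(w * x for w, x in zip(weights, activations))
print(result, reference)
```

Zeros drop out automatically because they set no bit in either mask, which is one reason ternary encodings can reduce work compared to full-precision multiply-accumulate.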

Keywords
convolutional neural networks, ternary neural network, activation function, optimization
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45042 (URN)
2-s2.0-85074915397 (Scopus ID)
Conference
22nd Euromicro Conference on Digital System Design DSD 2019, 28 Aug 2019, Chalkidiki, Greece
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
Available from: 2019-08-23 Created: 2019-08-23 Last updated: 2019-11-21 Bibliographically approved
Tsog, N., Nolin, M. & Bruhn, F. (2019). Using Docker in Process Level Isolation for Heterogeneous Computing on GPU Accelerated On-Board Data Processing Systems. Paper presented at 12th IAA Symposium on Small Satellites for Earth Observation, Berlin, Germany.
2019 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Technological advancements make intelligent on-board data processing possible on small-scale satellites and deep-space exploration spacecraft such as CubeSats. However, the operation of satellites may fall into critical conditions when the on-board data processing interferes strongly with the basic operation functionalities of satellites. In order to avoid these issues, there exist techniques such as isolation, partitioning, and virtualization. In this paper, we present an experimental study of isolating on-board payload data processing from the basic operations of satellites using Docker. Docker is a leading technology in process level isolation as well as in continuous integration and continuous deployment (CI/CD). This study builds on the prior study of a heterogeneous computing method, which improves the schedulability of the entire system by up to 90%. Based on this heterogeneous computing method, a comparison study has been conducted between the non-isolated and isolated environments.

Keywords
Process level isolation, Docker, On-board data processing, Heterogeneous computing, cgroups, Linux
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45939 (URN)
Conference
12th IAA Symposium on Small Satellites for Earth Observation, Berlin, Germany
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-11-11 Created: 2019-11-11 Last updated: 2019-11-11
Tsog, N., Sjödin, M. & Bruhn, F. (2019). Using Heterogeneous Computing on GPU Accelerated Systems to Advance On-Board Data Processing. In: European Workshop on On-Board Data Processing 2019 OBDP2019. Paper presented at European Workshop on On-Board Data Processing 2019 OBDP2019, 25 Feb 2019, Amsterdam, Netherlands.
2019 (English). In: European Workshop on On-Board Data Processing 2019 OBDP2019, 2019. Conference paper, Published paper (Refereed)
Keywords
Heterogeneous Computing, GPU accelerated On-Board Data Processing, Advanced On-Board Data Processing
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:mdh:diva-45490 (URN)
Conference
European Workshop on On-Board Data Processing 2019 OBDP2019, 25 Feb 2019, Amsterdam, Netherlands
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2019-10-29 Created: 2019-10-29 Last updated: 2019-10-29 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-7586-0409
