https://www.mdu.se/

mdu.se Publications: search results 51-92 of 92

  • 51.
    Mahdiani, H.
    et al.
    School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
    Khadem, A.
    Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, United States.
    Yasoubi, A.
    Ghanbari, A.
    School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
    Modarressi, M.
    School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Computation reuse-aware accelerator for neural networks, 2020. In: Hardware Architectures for Deep Learning, Institution of Engineering and Technology, 2020, p. 147-158. Chapter in book (Other academic)
    Abstract [en]

    Power consumption has long been a significant concern in neural networks. In particular, large neural networks that implement novel machine learning techniques require much more computation, and hence power, than ever before. In this chapter, we showed that computation reuse can exploit the inherent redundancy in the arithmetic operations of a neural network to save power. Experimental results showed that computation reuse, when coupled with the approximation property of neural networks, can eliminate up to 90% of multiplications, reducing power consumption by 61% on average in the presented architecture. The proposed computation reuse-aware design can be extended in several ways. First, it can be integrated into several state-of-the-art customized architectures for LSTM, spiking, and convolutional neural network models to further reduce power consumption. Second, computation reuse can be coupled with existing mapping and scheduling algorithms to develop reuse-aware scheduling and mapping methods for neural networks. Computation reuse can also boost the performance of methods that eliminate ineffectual computations in deep learning neural networks. Evaluating the impact of CORN on reliability and customizing the CORN architecture for FPGA-based neural network implementations are further directions for future work.
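
    The chapter does not include code, but the core idea of computation reuse can be illustrated in a few lines: because quantized weights and activations take values from a small set, many weight-by-input products repeat and can be cached instead of recomputed. The sketch below is a hypothetical, purely behavioural illustration; it is not the CORN architecture itself.

    ```python
    # Illustrative only: memoizing repeated weight*input products in a
    # quantized fully connected layer. Names and structure are assumptions,
    # not taken from the CORN architecture described in the chapter.
    import numpy as np

    def reuse_aware_matvec(weights_q, x_q):
        """Compute y = W.x while reusing products of repeated (weight, input) pairs."""
        product_cache = {}          # (weight value, input value) -> product
        reused = 0
        y = np.zeros(weights_q.shape[0])
        for i, row in enumerate(weights_q):
            acc = 0
            for w, x in zip(row, x_q):
                key = (w, x)
                if key in product_cache:
                    acc += product_cache[key]
                    reused += 1     # multiplication eliminated by reuse
                else:
                    p = w * x
                    product_cache[key] = p
                    acc += p
            y[i] = acc
        return y, reused

    rng = np.random.default_rng(0)
    W = rng.integers(-8, 8, size=(16, 64))      # coarsely quantized weights
    x = rng.integers(0, 16, size=64)            # quantized activations
    y, reused = reuse_aware_matvec(W, x)
    print(f"multiplications reused: {reused} of {W.size}")
    ```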

  • 52.
    Mahdiani, Hoda
    et al.
    Univ Tehran, Dept Elect & Comp Engn, Comp Engn, Tehran, Iran..
    Khadem, Alireza
    Univ Tehran, Tehran, Iran..
    Ghanbari, Azam
    Univ Tehran, Dept Elect & Comp Engn, Comp Engn, Tehran, Iran..
    Modarressi, Mehdi
    Univ Tehran, Coll Engn, Dept Elect & Comp Engn, Tehran, Iran..
    Fattahi-Bayat, Farima
    Univ Tehran, Tehran, Iran..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    ΔNN: Power-Efficient Neural Network Acceleration Using Differential Weights, 2020. In: IEEE Micro, ISSN 0272-1732, E-ISSN 1937-4143, Vol. 40, no 1, p. 67-74. Article in journal (Refereed)
    Abstract [en]

    The enormous and ever-increasing complexity of state-of-the-art neural networks has impeded the deployment of deep learning on resource-limited embedded and mobile devices. To reduce the complexity of neural networks, this article presents Delta NN, a power-efficient architecture that leverages a combination of the approximate value locality of neuron weights and the algorithmic structure of neural networks. Delta NN keeps each weight as its difference (Delta) to the nearest smaller weight: each weight reuses the calculations of the smaller weight, followed by a calculation on the Delta value to make up the difference. We also round the Delta up or down to the closest power of two to further reduce complexity. The experimental results show that Delta NN boosts the average performance by 14%-37% and reduces the average power consumption by 17%-49% over some state-of-the-art neural network designs.
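
    The differential-weight idea lends itself to a small behavioural sketch: weights that share an input are processed in sorted order, each product is obtained from the previous one plus a Delta contribution, and the Delta is rounded to a power of two so the correction becomes a shift in hardware. The helper below is an assumed illustration, not the ΔNN hardware design, and it also shows the approximation error the rounding introduces.

    ```python
    # Hedged illustration of the differential-weight idea: each weight is stored
    # as its difference (delta) to the nearest smaller weight, and the delta is
    # rounded to the nearest power of two so the correction is a cheap shift.
    # This is a behavioural sketch, not the paper's hardware architecture.
    import numpy as np

    def round_to_pow2(d):
        """Round a positive delta to the nearest power of two (non-positive -> 0)."""
        if d <= 0:
            return 0
        e = int(round(float(np.log2(d))))
        return 2 ** e

    def delta_products(weights, x):
        """Products weight*x for all weights sharing input x, computed incrementally."""
        order = np.argsort(weights)
        products = np.empty_like(weights, dtype=float)
        prev_w, prev_p = 0, 0.0                 # start from weight 0, product 0
        for idx in order:
            delta = round_to_pow2(weights[idx] - prev_w)   # approximate delta
            prev_p = prev_p + delta * x         # delta*x is a shift of x in hardware
            prev_w = prev_w + delta
            products[idx] = prev_p
        return products

    w = np.array([3, 5, 6, 12, 13])             # non-negative weights sharing one input
    x = 7
    print(delta_products(w, x))                 # approximate w*x via reused partials
    print(w * x)                                # exact reference for comparison
    ```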

  • 53.
    Majd, A.
    et al.
    Abo Akademi University, Turku, Finland.
    Ashraf, A.
    Abo Akademi University, Turku, Finland.
    Troubitsyna, E.
    Abo Akademi University, Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Integrating Learning, Optimization, and Prediction for Efficient Navigation of Swarms of Drones, 2018. In: Proceedings - 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018, Institute of Electrical and Electronics Engineers Inc., 2018, p. 101-108. Conference paper (Refereed)
    Abstract [en]

    Swarms of drones are increasingly being used in a variety of monitoring and surveillance, search and rescue, and photography and filming tasks. However, despite the growing popularity of swarm-based applications of drones, there is still a lack of approaches to generate efficient drone routes while minimizing the risks of drone collisions. In this paper, we present a novel approach that integrates learning, optimization, and prediction for generating efficient and safe routes for swarms of drones. The proposed approach comprises three main components: (1) a high-performance dynamic evolutionary algorithm for optimizing drone routes, (2) a reinforcement learning algorithm for incorporating feedback and runtime data about the system state, and (3) a prediction approach to predict the movement of drones and moving obstacles in the flying zone. We also present a parallel implementation of the proposed approach and evaluate it against two benchmarks. The results demonstrate that the proposed approach significantly reduces route lengths and computation overhead while producing efficient and safe routes.

  • 54.
    Majd, A.
    et al.
    Faculty of Natural Sciences and Technology, Åbo Akademi University, Turku, Finland.
    Ashraf, A.
    Faculty of Natural Sciences and Technology, Åbo Akademi University, Turku, Finland.
    Troubitsyna, E.
    Faculty of Natural Sciences and Technology, Åbo Akademi University, Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Using Optimization, Learning, and Drone Reflexes to Maximize Safety of Swarms of Drones, 2018. In: 2018 IEEE Congress on Evolutionary Computation, CEC 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018. Conference paper (Refereed)
    Abstract [en]

    Despite the growing popularity of swarm-based applications of drones, there is still a lack of approaches to maximize the safety of swarms of drones by minimizing the risks of drone collisions. In this paper, we present an approach that uses optimization, learning, and automatic immediate responses (reflexes) of drones to ensure safe operations of swarms of drones. The proposed approach integrates a high-performance dynamic evolutionary algorithm and a reinforcement learning algorithm to generate safe and efficient drone routes and then augments the generated routes with dynamically computed drone reflexes to prevent collisions with unforeseen obstacles in the flying zone. We also present a parallel implementation of the proposed approach and evaluate it against two benchmarks. The results show that the proposed approach maximizes safety and generates highly efficient drone routes.

  • 55.
    Majd, A.
    et al.
    Åbo Akademi University, Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Troubitsyna, E.
    Åbo Akademi University, Turku, Finland.
    Sahebi, G.
    University of Turku, Turku, Finland.
    Optimal smart mobile access point placement for maximal coverage and minimal communication, 2017. In: ACM International Conference Proceeding Series, Volume Part F130524, Association for Computing Machinery, 2017, Vol. Part F130524, article id a21. Conference paper (Refereed)
    Abstract [en]

    The selection of optimal placements of access points and sensors constitutes one of the fundamental challenges in monitoring spatial phenomena in wireless sensor networks (WSNs). Access points should occupy the best locations in order to obtain a sufficient degree of coverage with a low communication cost. Finding an optimal placement is an NP-hard problem that is further complicated by real-world conditions such as obstacles, radiation interference, etc. In this paper, we propose a compound method to select the best near-optimal placement of smart mobile access points (SMAPs) with the goal of maximizing monitoring coverage and minimizing communication cost. Our approach combines a parallel implementation of the Imperialist Competitive Algorithm (ICA) with a greedy method. The benchmarking of the proposed approach demonstrates its clear advantages in solving and optimizing the placement problem.

  • 56.
    Majd, A.
    et al.
    Faculty of Science and Engineering, Åbo Akademi University, Domkyrkotorget 3, Turku, Finland.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sahebi, G.
    Department of Future Technologies, University of Turku, Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Improving motion safety and efficiency of intelligent autonomous swarm of drones, 2020. In: Drones, E-ISSN 2504-446X, Vol. 4, no 3, p. 1-19, article id 48. Article in journal (Refereed)
    Abstract [en]

    Interest is growing in the use of autonomous swarms of drones in various mission-physical applications such as surveillance, intelligent monitoring, and rescue operations. Swarm systems should fulfill safety and efficiency constraints in order to guarantee dependable operations. To maximize motion safety, we should design the swarm system in such a way that drones do not collide with each other and/or other objects in the operating environment. On the other hand, to ensure that the drones have sufficient resources to complete the required task reliably, we should also achieve efficiency while implementing the mission by minimizing the travelling distance of the drones. In this paper, we propose a novel integrated approach that maximizes motion safety and efficiency while planning and controlling the operation of the swarm of drones. To achieve this goal, we propose a novel parallel evolutionary-based swarm mission planning algorithm. Evolutionary computing allows us to plan and optimize the routes of the drones at run-time to maximize safety while minimizing travelling distance as the efficiency objective. In order to fulfill the defined constraints efficiently, our solution promotes a holistic approach that considers the whole design process, from the definition of formal requirements through software development. The results of benchmarking demonstrate that our approach improves route efficiency by up to 10%, without any crashes when controlling swarms, compared to state-of-the-art solutions.

  • 57.
    Majd, A.
    et al.
    Åbo Akademi University, Turku, Finland.
    Sahebi, G.
    University of Turku, Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Troubitsyna, E.
    Åbo Akademi University, Turku, Finland.
    Optimizing scheduling for heterogeneous computing systems using combinatorial meta-heuristic solution, 2017. In: 2017 IEEE SmartWorld Ubiquitous Intelligence and Computing, Advanced and Trusted Computed, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovation, SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI 2017 - Conference Proceedings, 2017, p. 1-8. Conference paper (Refereed)
    Abstract [en]

    With the rapid development of Network-on-Chip (NoC)-based many-core systems, the task scheduling problem plays a critical role in high-performance computing; it is an NP-hard problem. The complexity increases further when the scheduling problem is applied to heterogeneous platforms. Exploring the whole search space in order to find the optimal solution is not time-efficient, thus metaheuristics are mostly used to find a near-optimal solution in a reasonable amount of time. We propose a compound method to select the best near-optimal task schedule on a heterogeneous platform in order to minimize the execution time. For this, we combine a new parallel meta-heuristic method with a greedy scheme. We introduce a novel metaheuristic method for near-optimal scheduling that can provide performance guarantees for multiple applications implemented on a shared platform. Applications are modeled as directed acyclic task graphs (DAGs) for execution on a heterogeneous NoC-based many-core platform with given communication costs. We introduce an order-based encoding, especially for pipelined operation, that decreases execution time by more than 46%. Moreover, we present a novel multi-population method inspired by both genetic and imperialist competitive algorithms, specialized for the scheduling problem, improving the convergence policy and selection pressure. The potential of the approach is demonstrated by experiments using a Sobel filter, a SUSAN filter, RASTA-PLP, and a JPEG encoder as real-world case studies.
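
    As an illustration of how an order-based encoding can be decoded into a schedule, the toy function below takes a topological order of DAG tasks and greedily places each task on the heterogeneous core that finishes it earliest. Communication costs, pipelining, and the paper's actual genetic/imperialist operators are deliberately left out; all names and numbers are made up.

    ```python
    # A toy decoder for an order-based encoding: the chromosome is a topological
    # order of the DAG tasks, and each task is greedily placed on the
    # heterogeneous core that finishes it earliest. Communication costs and the
    # paper's pipelined variant are omitted; everything here is hypothetical.

    def decode_schedule(order, exec_time, deps):
        """order: list of task ids (a topological order of the DAG)
        exec_time[t][p]: execution time of task t on core p
        deps[t]: set of tasks that must finish before t starts"""
        n_cores = len(next(iter(exec_time.values())))
        core_free = [0.0] * n_cores            # when each core becomes idle
        finish = {}                            # task -> finish time
        placement = {}
        for t in order:
            ready = max((finish[d] for d in deps[t]), default=0.0)
            # pick the core on which this task finishes earliest
            best = min(range(n_cores),
                       key=lambda p: max(ready, core_free[p]) + exec_time[t][p])
            start = max(ready, core_free[best])
            finish[t] = start + exec_time[t][best]
            core_free[best] = finish[t]
            placement[t] = best
        return max(finish.values()), placement   # makespan, task placement

    exec_time = {"a": [2, 4], "b": [3, 2], "c": [4, 1], "d": [2, 2]}
    deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    print(decode_schedule(["a", "b", "c", "d"], exec_time, deps))
    ```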

  • 58.
    Majd, Amin
    et al.
    Åbo Akad Univ, Turku, Finland..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Plosila, Juha
    Univ Turku, Turku, Finland..
    Moghaddami Khalilzad, Nima
    KTH Royal Inst Technol, Stockholm, Sweden..
    Sahebi, Golnaz
    Univ Turku, Turku, Finland..
    Troubitsyna, Elena
    Åbo Akad Univ, Turku, Finland..
    NOMeS: Near-Optimal Metaheuristic Scheduling for MPSoCs, 2017. In: 2017 19th International Symposium on Computer Architecture and Digital Systems (CADS), 2017, p. 70-75. Conference paper (Refereed)
    Abstract [en]

    The task scheduling problem for Multiprocessor System-on-Chips (MPSoC), which plays a vital role in performance, is an NP-hard problem. Exploring the whole search space in order to find the optimal solution is not time efficient, thus metaheuristics are mostly used to find a near-optimal solution in a reasonable amount of time. We propose a novel metaheuristic method for near-optimal scheduling that can provide performance guarantees for multiple applications implemented on a shared platform. Applications are represented as directed acyclic task graphs (DAG) and are executed on an MPSoC platform with given communication costs. We introduce a novel multi-population method inspired by both genetic and imperialist competitive algorithms. It is specialized for the scheduling problem with the goal to improve the convergence policy and selection pressure. The potential of the approach is demonstrated by experiments using a Sobel filter, a SUSAN filter, RASTA-PLP and JPEG encoder as real-world case studies.

  • 59.
    Majd, Amin
    et al.
    Åbo Akademi, Finland.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sahebi, Golnaz
    University of Turku, Finland.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Troubitsyna, Elena
    KTH, Sweden.
    A Cloud Based Super-Optimization Method to Parallelize the Sequential Code’s Nested Loops, 2019. In: IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip MCSoC-2019, 2019. Conference paper (Refereed)
    Abstract [en]

    Advances in hardware architecture regarding multi-core processors have made parallel computing ubiquitous. To achieve maximum utilization of multi-core processors, parallel programming techniques are required. However, several challenges stand in the way of parallel programming. These problems fall into three major groups. First, although recent advancements in parallel programming languages (e.g., MPI, OpenCL, etc.) assist developers, parallel programming is still not desirable for most programmers. The second concerns the massive volume of old software and applications that have been written in serial mode; converting millions of lines of serial code to parallel code is highly time-consuming and requires a huge verification effort. Third, producing software and applications in parallel mode is very expensive since it needs knowledge and expertise. Super-optimization, provided by super compilers, is the process of automatically determining dependent and independent instructions to find data dependencies and loop-free sequences of instructions. The super compiler then runs these instructions on different processors in parallel mode, where possible. Super-optimization is a feasible solution for relieving the programmer of the parallel programming workload. Since most of the complexity of sequential code lies in nested loops, we try to parallelize the nested loops using the idea of super-optimization. One of the underlying stages in super-optimization is scheduling the tiled space for iterating nested loops. Since the problem is NP-hard, using traditional optimization methods is not feasible. In this paper, we propose a cloud-based super-optimization method offered as Software-as-a-Service (SaaS) to reduce the cost of parallel programming. In addition, it increases the utilization of the processing capacity of the multi-core processor. As a result, an intermediate programmer can use the whole processing capacity of his or her system, without knowing anything about writing parallel code or super compiler functions, by sending the serial code to a cloud server and receiving the parallel version of the code back from it. In this paper, an evolutionary algorithm is leveraged to solve the tile scheduling problem. Our proposed super-optimization method is served as software and provided under a hybrid (public and private) deployment model.

  • 60.
    Majd, Amin
    et al.
    Abo Akad Univ, Finland..
    Sahebi, Golnaz
    Univ Turku, Finland..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Plosila, Juha
    Univ Turku, Finland..
    Lotfi, Shahriar
    Univ Tabriz, Iran..
    Tenhunen, Hannu
    Univ Turku, Finland..
    Parallel imperialist competitive algorithms, 2018. In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 30, no 7, article id e4393. Article in journal (Refereed)
    Abstract [en]

    The importance of optimization and NP-problem solving cannot be overemphasized. The usefulness and popularity of evolutionary computing methods are also well established. There are various types of evolutionary methods; they are mostly sequential, but some of them have parallel implementations as well. We propose a multi-population method to parallelize the Imperialist Competitive Algorithm. The algorithm has been implemented with the Message Passing Interface on two computer platforms, and we have tested our method on both shared-memory and message-passing architectural models. Outstanding performance is obtained, demonstrating that the proposed method is very efficient with respect to both speed and accuracy. In addition, compared with a set of existing well-known parallel algorithms, our approach obtains more accurate results within a shorter time period.
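
    A rough sketch of the multi-population (island-model) idea behind the parallelization: sub-populations evolve independently in parallel workers and periodically exchange their best members. The ICA-specific mechanics (empires, assimilation, revolution, imperialist competition) are abstracted into a generic evolutionary step below, and the objective function and parameters are toy assumptions rather than the paper's setup.

    ```python
    # Island-model parallelisation in the spirit of a multi-population method:
    # several sub-populations evolve in parallel workers and periodically
    # exchange their best members. This is an assumed toy example, not the
    # MPI-based ICA implementation described in the article.
    import numpy as np
    from multiprocessing import Pool

    def sphere(x):
        return float(np.sum(x ** 2))            # toy objective to minimise

    def evolve_island(args):
        pop, generations = args
        rng = np.random.default_rng()
        for _ in range(generations):
            children = pop + rng.normal(0, 0.1, pop.shape)    # mutation
            both = np.vstack([pop, children])
            fitness = np.apply_along_axis(sphere, 1, both)
            pop = both[np.argsort(fitness)[:len(pop)]]        # survivor selection
        return pop                                            # sorted best-first

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        islands = [rng.uniform(-5, 5, (20, 4)) for _ in range(4)]
        with Pool(4) as pool:
            for epoch in range(5):                            # migration epochs
                islands = pool.map(evolve_island, [(p, 50) for p in islands])
                best = min((p[0] for p in islands), key=sphere)
                islands = [np.vstack([p[:-1], best]) for p in islands]  # migrate best
            print("best fitness:", sphere(best))
    ```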

  • 61.
    Majd, Amin
    et al.
    Abo Akad Univ, Dept Informat Technol, Turku, Finland..
    Troubitsyna, Elena
    Abo Akad Univ, Dept Informat Technol, Turku, Finland..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Optimal Placement for Smart Mobile Access Points, 2018. In: 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI) / [ed] Wang, G; Han, Q; Bhuiyan, MZA; Ma, X; Loulergue, F; Li, P; Roveri, M; Chen, L, 2018, p. 1659-1667. Conference paper (Refereed)
    Abstract [en]

    Recently, Smart Mobile Access Point (SMAP) based architectures have emerged as a promising solution for creating smart systems supporting the monitoring of spatial phenomena. SMAPs allow us to predict communication activities in a system using the information collected from the network, and to select the best approach to support the network at any given time. To improve network performance, SMAPs can autonomously change their positions. They communicate with each other and carry out distributed computing tasks, constituting a mobile fog-computing platform. However, the communication cost becomes a critical factor. In this paper, we propose a compound method to select the best near-optimal placement of SMAPs with the goal of maximizing monitoring coverage and minimizing communication cost. Our approach combines a parallel implementation of the Imperialist Competitive Algorithm (ICA) with Kruskal's Algorithm.

  • 62.
    Maleki, Neda
    et al.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Conti, Mauro
    University of Padua, Italy .
    Fotouhi, Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    SoFA: A Spark-oriented Fog Architecture, 2019. In: IEEE 45th Annual Conference of the Industrial Electronics Society IECON'19, 2019. Conference paper (Refereed)
    Abstract [en]

    Fog computing offers a wide range of service levels including low bandwidth usage, low response time, support of heterogeneous applications, and high energy efficiency. Therefore, real-time embedded applications could potentially benefit from Fog infrastructure. However, providing high system utilization is an important challenge of Fog computing, especially when processing embedded applications. In addition, although Fog computing extends cloud computing by providing more energy efficiency, it still suffers from considerable energy consumption, which is a limitation for embedded systems. To overcome the above limitations, in this paper we propose SoFA, a Spark-oriented Fog architecture that leverages Spark functionalities to provide higher system utilization, energy efficiency, and scalability. Compared to common Fog computing platforms, where edge devices are only responsible for processing data received from their IoT nodes, SoFA leverages the remaining processing capacity of all other edge devices. To this end, SoFA provides a distributed processing paradigm with the help of Spark to utilize the whole processing capacity of all available edge devices, leading to increased energy efficiency and system utilization. In other words, SoFA proposes a near-sensor processing solution in which the edge devices act as the Fog nodes. In addition, SoFA provides scalability by taking advantage of Spark functionalities. According to the experimental results, SoFA is a power-efficient and scalable solution desirable for embedded platforms, providing up to 3.1x higher energy efficiency on the Word-Count benchmark compared to the common Fog processing platform.
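
    For reference, the Word-Count benchmark mentioned above is the canonical Spark job sketched below. It runs unchanged whether the Spark workers are cloud machines or, as in SoFA, Fog/edge devices; the input path is a placeholder and nothing here reproduces the SoFA deployment itself.

    ```python
    # The kind of Spark job used as the Word-Count benchmark above. The file
    # path is a placeholder; the cluster master is whatever the Spark
    # configuration points at (cloud VMs or Fog/edge devices).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()
    lines = spark.sparkContext.textFile("hdfs:///data/input.txt")   # placeholder path
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    for word, n in counts.takeOrdered(10, key=lambda kv: -kv[1]):
        print(word, n)
    spark.stop()
    ```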

  • 63.
    Mirsalari, S. A.
    et al.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Nazari, N.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Ansarmohammadi, S. A.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Salehi, M. E.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    ELC-ECG: Efficient LSTM cell for ECG classification based on quantized architecture, 2021. In: Proceedings - IEEE International Symposium on Circuits and Systems, Institute of Electrical and Electronics Engineers Inc., 2021. Conference paper (Refereed)
    Abstract [en]

    Long Short-Term Memory (LSTM) is one of the most popular and effective Recurrent Neural Network (RNN) models used for sequence learning in applications such as ECG signal classification. Complex LSTMs can hardly be deployed on resource-limited bio-medical wearable devices due to their huge computation and memory requirements. Binary LSTMs have been introduced to cope with this problem. However, naive binarization leads to significant accuracy loss in ECG classification. In this paper, we propose an efficient LSTM cell along with a novel hardware architecture for ECG classification. By deploying 5-level binarized inputs and just 1-level binarization for weights, outputs, and in-memory cell activations, the delay of one LSTM cell operation is reduced by 50× with about 0.004% accuracy loss in comparison with a full-precision design for ECG classification.

  • 64.
    Mirsalari, S. A.
    et al.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Salehi, M. E.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    MuBiNN: Multi-level binarized recurrent neural network for EEG signal classification, 2020. In: Proceedings - IEEE International Symposium on Circuits and Systems, Institute of Electrical and Electronics Engineers Inc., 2020, article id 102250. Conference paper (Refereed)
    Abstract [en]

    Recurrent Neural Networks (RNNs) are widely used for learning sequences in applications such as EEG classification. Complex RNNs can hardly be deployed on wearable devices due to their computation- and memory-intensive processing patterns. Generally, reducing precision leads to much higher efficiency, and binarized RNNs have been introduced as energy-efficient solutions. However, naive binarization methods lead to significant accuracy loss in EEG classification. In this paper, we propose a multi-level binarized LSTM, which significantly reduces computations while ensuring accuracy very close to that of the full-precision LSTM. Our method reduces the delay of the 3-bit LSTM cell operation by 47× with less than 0.01% accuracy loss.

  • 65.
    Mirsalari, Seyed Ahmad
    et al.
    University of Tehran, Tehran, Iran.
    Nazari, Najmeh
    University of Tehran, Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Salehi, Mostafa E.
    University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn University of Technology, Estonia.
    FaCT-LSTM: Fast and Compact Ternary Architecture for LSTM Recurrent Neural Networks, 2022. In: IEEE design & test, ISSN 2168-2356, E-ISSN 2168-2364, Vol. 39, no 3, p. 45-53. Article in journal (Refereed)
    Abstract [en]

    Long Short-Term Memory (LSTM) has achieved great success in healthcare applications. However, its extensive computation cost and massive model size have become major obstacles to the deployment of such a powerful algorithm in resource-limited embedded systems such as wearable devices. Quantization is a promising way to reduce the memory footprint and computational cost of neural networks. Although quantization has achieved remarkable success in convolutional neural networks (CNNs), it still suffers from large accuracy loss in LSTM networks, especially at extremely low bitwidths. In this paper, we propose Fast and Compact Ternary LSTM (FaCT-LSTM), which bridges the accuracy gap between full-precision and quantized neural networks. We propose a hardware-friendly circuit to implement the ternarized LSTM and eliminate computation-intensive floating-point operations. With the proposed ternarized LSTM architectures, our experiments on ECG and EMG signals show ~0.88% to 2.04% accuracy loss in comparison to the full-precision counterparts while reducing latency and area by ~111× to 116× and ~29× to 33×, respectively. The proposed architectures also improve the memory footprint and bandwidth of full-precision signal classification by 17× and 31×, respectively.

  • 66.
    Mirsalari, Seyed Ahmad
    et al.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Salehi, Mostafa E.
    School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    MuBiNN: Multi-Level Binarized Recurrent Neural Network for EEG Signal Classification, 2020. In: 2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020. Conference paper (Refereed)
    Abstract [en]

    Recurrent Neural Networks (RNNs) are widely used for learning sequences in applications such as EEG classification. Complex RNNs can hardly be deployed on wearable devices due to their computation- and memory-intensive processing patterns. Generally, reducing precision leads to much higher efficiency, and binarized RNNs have been introduced as energy-efficient solutions. However, naive binarization methods lead to significant accuracy loss in EEG classification. In this paper, we propose a multi-level binarized LSTM, which significantly reduces computations while ensuring accuracy very close to that of the full-precision LSTM. Our method reduces the delay of the 3-bit LSTM cell operation by 47× with less than 0.01% accuracy loss.

  • 67.
    Mousavi, Hamid
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Alibeigi, M.
    Zenseact Ab, Göteborg, Sweden.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Computer Systems, Tallinn University of Technology, Tallinn, Estonia.
    DASS: Differentiable Architecture Search for Sparse Neural Networks, 2023. In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 22, no 5 s, article id 105. Article in journal (Refereed)
    Abstract [en]

    The deployment of Deep Neural Networks (DNNs) on edge devices is hindered by the substantial gap between performance requirements and available computational power. While recent research has made significant strides in developing pruning methods to build a sparse network for reducing the computing overhead of DNNs, there remains considerable accuracy loss, especially at high pruning ratios. We find that the architectures designed for dense networks by differentiable architecture search methods are ineffective when pruning mechanisms are applied to them. The main reason is that the current methods do not support sparse architectures in their search space and use a search objective that is made for dense networks and does not focus on sparsity. This paper proposes a new method to search for sparsity-friendly neural architectures. This is done by adding two new sparse operations to the search space and modifying the search objective. We propose two novel parametric SparseConv and SparseLinear operations in order to expand the search space to include sparse operations. In particular, these operations create a flexible search space by using sparse parametric versions of the linear and convolution operations. The proposed search objective lets us train the architecture based on the sparsity of the search-space operations. Quantitative analyses demonstrate that architectures found through DASS outperform those used in the state-of-the-art sparse networks on the CIFAR-10 and ImageNet datasets. In terms of performance and hardware effectiveness, DASS increases the accuracy of the sparse version of MobileNet-v2 from 73.44% to 81.35% (+7.91% improvement) with a 3.87× faster inference time.
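
    One plausible reading of a parametric sparse convolution is an ordinary convolution whose weights are gated element-wise by trainable importance scores that are thresholded in the forward pass. The sketch below follows that reading only as a speculative illustration; the actual SparseConv/SparseLinear formulation in the paper may differ, and a practical version would also need a straight-through estimator so gradients reach the scores.

    ```python
    # Speculative sketch of a "parametric sparse" convolution: an ordinary
    # convolution whose weights are gated by trainable scores thresholded at a
    # target sparsity. Not the operation defined in the DASS paper.
    import torch
    import torch.nn as nn

    class MaskedConv2d(nn.Module):
        def __init__(self, in_ch, out_ch, kernel_size, sparsity=0.5, **kw):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
            # one trainable "importance" score per weight
            self.score = nn.Parameter(torch.randn_like(self.conv.weight))
            self.sparsity = sparsity

        def forward(self, x):
            k = int(self.score.numel() * self.sparsity)
            threshold = self.score.flatten().kthvalue(k).values
            mask = (self.score > threshold).float()      # keep only the top scores
            # NOTE: this hard threshold is not differentiable; a real search
            # would use a straight-through estimator or a soft relaxation.
            return nn.functional.conv2d(x, self.conv.weight * mask,
                                        self.conv.bias, stride=self.conv.stride,
                                        padding=self.conv.padding)

    layer = MaskedConv2d(3, 16, 3, sparsity=0.9, padding=1)
    out = layer(torch.randn(1, 3, 32, 32))
    print(out.shape)          # torch.Size([1, 16, 32, 32])
    ```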

  • 68.
    Mousavi, Hamid
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Zoljodi, Ali
    Mälardalen University, School of Innovation, Design and Engineering, Innovation and Product Realisation.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn University of Technology (Taltech), Tallinn, Estonia.
    Analysing Robustness of Tiny Deep Neural Networks, 2023. In: Commun. Comput. Info. Sci., Springer Science and Business Media Deutschland GmbH, 2023, p. 150-159. Conference paper (Refereed)
    Abstract [en]

    Real-world applications that are safety-critical and resource-constrained necessitate using compact and robust Deep Neural Networks (DNNs) against adversarial data perturbation. MobileNet-tiny has been introduced as a compact DNN to deploy on edge devices to reduce the size of networks. To make DNNs more robust against adversarial data, adversarial training methods have been proposed. However, recent research has investigated the robustness of large-scale DNNs (such as WideResNet), while the robustness of tiny DNNs has not been analysed. In this paper, we analyse how the width of the blocks in MobileNet-tiny affects the robustness of the network against adversarial data perturbation. Specifically, we evaluate natural accuracy, robust accuracy, and perturbation instability metrics on MobileNet-tiny with various inverted bottleneck blocks in different configurations. We generate configurations for the inverted bottleneck blocks using different width-multiplier and expand-ratio hyper-parameters. We discover that expanding the width of the blocks in MobileNet-tiny can improve natural and robust accuracy but increases perturbation instability. In addition, after a certain threshold, increasing the width of the network does not yield significant gains in robust accuracy and increases perturbation instability. We also analyse the relationship between the width-multiplier and expand-ratio hyper-parameters and the Lipschitz constant, both theoretically and empirically. It shows that wider inverted bottleneck blocks tend to have significant perturbation instability. These architectural insights can be useful in developing adversarially robust tiny DNNs for edge devices.

  • 69.
    Mubeen, Saad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lo Bello, L.
    University of Catania, Italy.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Saponara, S.
    University of Pisa, Italy.
    Guest Editorial: Special issue on parallel, distributed, and network-based processing in next-generation embedded systems, 2021. In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 117, article id 102159. Article in journal (Refereed)
  • 70.
    Namazi, A.
    et al.
    University of Tehran, Tehran, Iran.
    Abdollahi, M.
    University of Tehran, Tehran, Iran.
    Safari, S.
    University of Tehran, Tehran, Iran.
    Mohammadi, S.
    University of Tehran, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    LRTM: Life-time and Reliability-aware Task Mapping Approach for Heterogeneous Multi-core Systems, 2018. In: 2018 11th International Workshop on Network on Chip Architectures, NoCArc 2018 - In conjunction with the 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018, Institute of Electrical and Electronics Engineers Inc., 2018, article id 8541223. Conference paper (Refereed)
    Abstract [en]

    Technology scaling, the increasing number of components on a single chip, and aging effects have brought severe reliability challenges to multi-core platforms, which are more susceptible to faults, both permanent and transient. This paper proposes a Lifetime and Reliability-aware Task Mapping (LRTM) approach for many-core platforms with heterogeneous cores. It tries to confront both transient faults and wear-out failures. Our proposed approach maintains the predefined level of reliability for the task graph in the presence of transient faults over the whole lifetime of the system. LRTM uses a replication technique with minimum replica overhead, maximum achievable performance, and minimum temperature increase to confront transient faults while increasing the lifetime of the system. Besides, LRTM specifies task migration plans with minimum overhead, using a novel heuristic approach, on the occurrence of permanent core failures due to wear-out mechanisms. Task migration scenarios are used during run-time to increase the lifetime of the system while maintaining its reliability threshold. Results show that the effectiveness of LRTM improves for bigger mesh sizes and higher reliability thresholds. Simulation results obtained from real benchmarks show the proposed approach decreases design-time calculation up to 4371% compared to exhaustive exploration while achieving a lifetime only negligibly lower than the exhaustive solution (up to 5.83%). LRTM also increases lifetime by about 3% compared to other heuristic approaches in the literature.

  • 71.
    Nazari, N.
    et al.
    University of Tehran, School of Electrical and Computer Engineering, Tehran, Iran.
    Mirsalari, S. A.
    University of Tehran, School of Electrical and Computer Engineering, Tehran, Iran.
    Sinaei, Sima
    Malardalen University, Division of Intelligent Future Technologies, Vasteras, Sweden.
    Salehi, M. E.
    University of Tehran, School of Electrical and Computer Engineering, Tehran, Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Multi-level Binarized LSTM in EEG Classification for Wearable Devices, 2020. In: Proceedings - 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2020, Institute of Electrical and Electronics Engineers Inc., 2020, p. 175-181. Conference paper (Refereed)
    Abstract [en]

    Long Short-Term Memory (LSTM) is widely used in various sequential applications. Complex LSTMs can hardly be deployed on wearable and resource-limited devices due to the huge amount of computations and memory requirements. Binary LSTMs have been introduced to cope with this problem; however, they lead to significant accuracy loss in some applications, such as EEG classification, that are essential to deploy on wearable devices. In this paper, we propose an efficient multi-level binarized LSTM which significantly reduces computations while ensuring accuracy very close to that of the full-precision LSTM. By deploying 5-level binarized weights and inputs, our method reduces the area and delay of the MAC operation by about 31× and 27× in 65 nm technology, respectively, with less than 0.01% accuracy loss. In contrast to many compute-intensive deep-learning approaches, the proposed algorithm is lightweight and therefore brings performance efficiency with accurate LSTM-based EEG classification to real-time wearable devices.
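
    Multi-level binarization is commonly realised as a residual sum of scaled sign() terms, so that multiply-accumulates reduce to additions and subtractions plus a few scalings. The snippet below illustrates that general idea on a random tensor; the paper's exact 5-level scheme for weights and inputs may differ.

    ```python
    # Generic residual multi-level binarisation, shown only as an illustration
    # of the idea; the paper's specific 5-level scheme may differ.
    import numpy as np

    def multilevel_binarize(x, levels):
        """Return scales alpha_i and binary tensors b_i with x ~= sum_i alpha_i*b_i."""
        residual = x.astype(float)
        alphas, binaries = [], []
        for _ in range(levels):
            b = np.sign(residual)
            b[b == 0] = 1
            alpha = np.mean(np.abs(residual))
            alphas.append(alpha)
            binaries.append(b)
            residual = residual - alpha * b    # binarise what is still unexplained
        return alphas, binaries

    x = np.random.default_rng(1).normal(size=1000)
    for k in (1, 3, 5):
        alphas, bs = multilevel_binarize(x, k)
        approx = sum(a * b for a, b in zip(alphas, bs))
        err = np.linalg.norm(x - approx) / np.linalg.norm(x)
        print(f"{k}-level approximation, relative error {err:.3f}")
    ```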

  • 72.
    Nazari, Najmeh
    et al.
    University of Tehran, Tehran , Iran.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ES (Embedded Systems).
    E. Salehi, Mostafa
    University of Tehran, Tehran , Iran.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    TOT-Net: An Endeavor Toward Optimizing Ternary Neural Networks, 2019. In: 22nd Euromicro Conference on Digital System Design DSD 2019, 2019, p. 305-312, article id 8875067. Conference paper (Refereed)
    Abstract [en]

    High computation demands and large memory requirements are the major implementation challenges of Convolutional Neural Networks (CNNs), especially for low-power and resource-limited embedded devices. Many binarized neural networks have recently been proposed to address these issues. Although they have significantly decreased computation and memory footprint, they suffer from accuracy loss, especially on large datasets. In this paper, we propose TOT-Net, a ternarized neural network with [-1, 0, 1] values for both weights and activation functions that simultaneously achieves a higher level of accuracy and a lower computational load. First, TOT-Net introduces simple bitwise logic for convolution computations to reduce the cost of multiply operations. To improve accuracy, selecting a proper activation function and learning rate is influential, but also difficult. As the second contribution, we propose a novel piece-wise activation function and optimized learning rates for different datasets; our findings reveal that 0.01 is a preferable learning rate for the studied datasets. Third, by using an evolutionary optimization approach, we found novel piece-wise activation functions customized for TOT-Net. According to the experimental results, TOT-Net achieves 2.15%, 8.77%, and 5.7%/5.52% better accuracy compared to XNOR-Net on CIFAR-10, CIFAR-100, and ImageNet (top-5/top-1), respectively.
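
    Ternarization to {-1, 0, 1} can be illustrated with a simple threshold rule and a dot product that needs only additions and subtractions. The 0.7 x mean(|w|) threshold below is a common heuristic from the ternary-network literature and is not necessarily the rule used in TOT-Net; the sketch is illustrative only.

    ```python
    # Threshold ternarisation in the spirit of {-1, 0, 1} networks. The
    # 0.7*mean(|w|) threshold is a common heuristic, not TOT-Net's own rule.
    import numpy as np

    def ternarize(w, factor=0.7):
        threshold = factor * np.mean(np.abs(w))
        return np.where(w > threshold, 1, np.where(w < -threshold, -1, 0))

    def ternary_dot(w_t, x):
        """With ternary weights a dot product needs only additions/subtractions."""
        return x[w_t == 1].sum() - x[w_t == -1].sum()

    rng = np.random.default_rng(0)
    w = rng.normal(size=256)
    x = rng.normal(size=256)
    w_t = ternarize(w)
    print("ternary dot:", ternary_dot(w_t, x))
    print("same result:", float(w_t @ x))      # sanity check against a normal dot
    ```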

  • 73.
    Rezaei, A.
    et al.
    Northwestern University (NU), Evanston, IL, United States.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Zhou, H.
    Northwestern University (NU), Evanston, IL, United States.
    Multiobjectivism in Dark Silicon Age, 2018. In: Advances in Computers, vol. 110, Academic Press Inc., 2018, Vol. 110, p. 83-126. Chapter in book (Refereed)
    Abstract [en]

    MCSoCs, with their scalability and parallel computation power, provide an ideal implementation base for modern embedded systems. However, chip designers are facing a design challenge wherein shrinking component sizes have improved density but have started to stress the energy budget. This phenomenon, called the utilization wall, has revolutionized the semiconductor industry by shifting the main purpose of chip design from a performance-driven approach to a complex multiobjective one. The area of the chip which cannot be powered is known as dark silicon. In this chapter, we address multiobjectivism in the dark silicon age. First, we overview state-of-the-art works in a categorized manner. Second, we introduce a NoC-based MCSoC architecture, named shift sprinting, in order to increase overall reliability as well as gain high performance. Third, we explain an application mapping approach, called round rotary mapping, for HWNoC-based MCSoCs in order to, first, balance the usage of wireless links by avoiding congestion over wireless routers and, second, spread temperature across the whole chip by utilizing dark silicon. Finally, we conclude the chapter by providing a future outlook on the dark silicon research trend.

  • 74.
    Rezaei, A.
    et al.
    Northwestern University, Evanston, United States.
    Zhao, D.
    Old Dominion University, Norfolk, United States.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Royal Institute of Technology (KTH), Sweden.
    Zhou, H.
    Northwestern University, Evanston, United States.
    Multi-objective Task Mapping Approach for Wireless NoC in Dark Silicon Age, 2017. In: Proceedings - 2017 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2017, Institute of Electrical and Electronics Engineers Inc., 2017, p. 589-592. Conference paper (Refereed)
    Abstract [en]

    Hybrid Wireless Network-on-Chip (HWNoC) provides high bandwidth, low latency, and flexible topology configurations, making this emerging technology a scalable communication fabric for future Many-Core System-on-Chips (MCSoCs). On the other hand, dark silicon is dominating the chip footage of upcoming MCSoCs since Dennard scaling fails due to the voltage scaling problem that results in higher power densities. Moreover, congestion avoidance and hot-spot prevention are two important challenges of HWNoC-based MCSoCs in the dark silicon age. Therefore, in this paper, a novel task mapping approach for HWNoC is introduced in order to, first, balance the usage of wireless links by avoiding congestion over wireless routers and, second, spread temperature across the whole chip by utilizing dark silicon. Simulation results show significant improvement in both congestion and temperature control of the system, compared to state-of-the-art works.

  • 75.
    Rezaei, Seyyed Hossein SeyyedAghaei
    et al.
    Univ Tehran, Tehran, Iran..
    Modarressi, Mehdi
    Univ Tehran, Tehran, Iran..
    Ausavarungnirun, Rachata
    King Mongkuts Univ Technol, Bangkok, Thailand..
    Sadrosadati, Mohammad
    Univ Tehran, Tehran, Iran..
    Mutlu, Onur
    Swiss Fed Inst Technol, Zurich, Switzerland..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Royal Inst Technol KTH.
    NoM: Network-on-Memory for Inter-Bank Data Transfer in Highly-Banked Memories, 2020. In: IEEE Computer Architecture Letters, E-ISSN 1556-6056, Vol. 19, no 1, p. 80-83. Article in journal (Refereed)
    Abstract [en]

    Data copy is a widely-used memory operation in many programs and operating system services. In conventional computers, data copy is often carried out by two separate read and write transactions that pass data back and forth between the DRAM chip and the processor chip. Some prior mechanisms propose to avoid this unnecessary data movement by using the shared internal bus in the DRAM chip to directly copy data within the DRAM chip (e.g., between two DRAM banks). While these methods exhibit superior performance compared to conventional techniques, data copy across different DRAM banks is still greatly slower than data copy within the same DRAM bank. Hence, these techniques have limited benefit for the emerging 3D-stacked memories (e.g., HMC and HBM) that contain hundreds of DRAM banks across multiple memory controllers. In this paper, we present Network-on-Memory (NoM), a lightweight inter-bank data communication scheme that enables direct data copy across both memory banks of a 3D-stacked memory. NoM adopts a TDM-based circuit-switching design, where circuit setup is done by the memory controller. Compared to state-of-the-art approaches, NoM enables both fast data copy between multiple DRAM banks and concurrent data transfer operations. Our evaluation shows that NoM improves the performance of data-intensive workloads by 3.8X and 75 percent, on average, compared to the baseline conventional 3D-stacked DRAM architecture and state-of-the-art techniques, respectively.

  • 76.
    Riazati, Mohammad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    AutoDeepHLS: Deep Neural Network High-level Synthesis using fixed-point precision, 2022. In: 2022 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2022): Intelligent Technology in the Post-Pandemic Era, IEEE, 2022, p. 122-125. Conference paper (Refereed)
    Abstract [en]

    Deep Neural Networks (DNNs) have received much attention in various applications such as visual recognition, self-driving cars, health care, etc. Hardware implementation, specifically using FPGAs and ASICs due to their high performance and low power consumption, is considered an efficient approach. However, implementation on these platforms is difficult for neural network designers since they usually have limited knowledge of hardware. High-Level Synthesis (HLS) tools can act as a bridge between high-level DNN designs and hardware implementation. Nevertheless, these tools usually need an implementation at the C level, whereas the design of neural networks is usually performed at a higher level (such as Keras or TensorFlow). In this paper, we propose a fully automated flow for creating a C-level implementation that is synthesizable with HLS tools. Various aspects such as performance, minimal access to memory elements, data type knobs, and design verification are considered. Our results show that the generated C implementation is much more HLS-friendly than previous works. Furthermore, a complete flow is proposed to determine different fixed-point precisions for network elements. We show that our method results in 25% and 34% reductions in bit-width for LeNet and VGG, respectively, without any accuracy loss.
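
    The kind of fixed-point rounding such a precision-selection flow has to reason about can be checked by hand: quantize values to a given total bit-width and fractional bit-width, then measure the introduced error. The snippet below is an assumed stand-alone check, not the AutoDeepHLS flow, which automates the choice of these widths per network element.

    ```python
    # Hand-rolled fixed-point rounding check: quantise to `total_bits`
    # two's-complement bits with `frac_bits` fractional bits, then measure the
    # error. Illustrative only; not the AutoDeepHLS precision-selection flow.
    import numpy as np

    def to_fixed_point(x, total_bits, frac_bits):
        scale = 2 ** frac_bits
        lo = -(2 ** (total_bits - 1))           # two's-complement range
        hi = 2 ** (total_bits - 1) - 1
        q = np.clip(np.round(x * scale), lo, hi)
        return q / scale                        # back to real values for comparison

    rng = np.random.default_rng(0)
    weights = rng.normal(0, 0.5, size=10000)
    for total, frac in [(16, 12), (12, 9), (8, 6)]:
        q = to_fixed_point(weights, total, frac)
        print(f"Q{total - frac}.{frac}: max error {np.max(np.abs(weights - q)):.5f}")
    ```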

  • 77.
    Riazati, Mohammad
    et al.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn University of Technology, Department of Computer Systems, Tallinn, Estonia.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    DeepFlexiHLS: Deep Neural Network Flexible High-Level Synthesis Directive Generator, 2022. In: 2022 IEEE Nordic Circuits and Systems Conference, NORCAS 2022 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2022. Conference paper (Refereed)
    Abstract [en]

    Deep Neural Networks (DNNs) are now widely adopted to solve various problems ranging from speech recognition to image classification. Since DNNs demand a large amount of processing power, their implementation on hardware, i.e., FPGA or ASIC, has received much attention. High-level synthesis is widely used since it significantly boosts productivity and flexibility and requires minimal hardware knowledge. However, when HLS transforms a C implementation into a Register-Transfer Level one, the high parallelism capability of the FPGA is not well utilized. HLS tools provide a feature called directives through which designers can guide the tool, using defined C pragma statements, to improve performance. Nevertheless, finding appropriate directives is another challenge, which needs considerable expertise and experience. This paper proposes DeepFlexiHLS, a two-stage design space exploration flow to find a set of directives that achieves minimal latency. In the first stage, a partition-based method is used to find the directives corresponding to each partition; aggregating all these directives leads to minimal latency. Experimental results show 54% more speed-up than similar work on the VGG neural network. In the second stage, an estimator is implemented to find the latency and resource utilization of various combinations of the found directives. The results form a Pareto frontier from which the designer can choose if FPGA resources are limited or are not to be entirely used by the DNN module.

  • 78.
    Riazati, Mohammad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn University of Technology, Tallinn, Estonia.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    DeepHLS: A complete toolchain for automatic synthesis of deep neural networks to FPGA, 2020. In: ICECS 2020 - 27th IEEE International Conference on Electronics, Circuits and Systems, Proceedings, Institute of Electrical and Electronics Engineers Inc., 2020, article id 9294881. Conference paper (Refereed)
    Abstract [en]

    Deep neural networks (DNNs) have achieved quality results in various applications of computer vision, especially in image classification problems. DNNs are computationally intensive, and nowadays their acceleration on FPGAs has received much attention. Many methods to accelerate DNNs have been proposed. Despite their performance features, such as acceptable accuracy or low latency, their use is not widely accepted by software designers, who usually do not have enough knowledge of the hardware details of the proposed accelerators. HLS tools are the major promising tools that can act as a bridge between software designers and hardware implementation. However, not only do most HLS tools support just C and C++ descriptions as input, but their results are also very sensitive to the coding style. This makes it difficult for software developers to adopt them, as DNNs are mostly described in high-level frameworks such as TensorFlow or Keras. In this paper, an integrated toolchain is presented that, in addition to converting Keras DNN descriptions to simple, flat, and synthesizable C output, provides other features such as accuracy verification, C-level knobs to easily change the data types from floating-point to fixed-point with arbitrary bit width, and latency and area utilization adjustment using HLS knobs.

  • 79.
    Riazati, Mohammad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    High-Level Synthesis Design Space Exploration for Highly Optimized Deep Neural Network Implementation. Manuscript (preprint) (Other academic)
  • 80.
    Riazati, Mohammad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    SHiLA: Synthesizing High-Level Assertions for High-Speed Validation of High-Level Designs, 2020. In: Proceedings - 2020 23rd International Symposium on Design and Diagnostics of Electronic Circuits and Systems, DDECS 2020, Institute of Electrical and Electronics Engineers Inc., 2020, article id 9095728. Conference paper (Refereed)
    Abstract [en]

    In the past, assertions were mostly used to validate the system throughout the design and simulation process. Later, a new method known as assertion synthesis was introduced, which enabled designers to use assertions for high-speed hardware emulation and for safety and reliability assurance after tape-out. Although the synthesis of assertions at the register transfer level has been proposed and implemented in several works, none of them can be adopted for high-level assertions. In this paper, we propose the SHiLA framework and a detailed implementation guide by which assertion synthesis can also be applied to high-level design processes. The proposed method, which is fully tool-independent, is not only an enabler of high-speed assertion-assisted simulation but can also be used in other scenarios that need assertion synthesis, as it has the minimum possible effect on the main design's performance.

  • 81.
    Riazati, Mohammad
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Mälardalens högskola, Västerås, Sweden.
    Ghasempouri, T.
    Tallinna Tehnikaülikool, Tallinn, Estonia.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Raik, J.
    Tallinna Tehnikaülikool, Tallinn, Estonia.
    Sjödin, M.
    Mälardalens högskola, Västerås, Sweden.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Adjustable self-healing methodology for accelerated functions in heterogeneous systems, 2020. In: Proceedings - Euromicro Conference on Digital System Design, DSD 2020, Institute of Electrical and Electronics Engineers Inc., 2020, p. 638-645, article id 9217868. Conference paper (Refereed)
    Abstract [en]

    Self-healing is a promising approach for designing reliable digital systems. It refers to the ability of a system to detect faults and automatically fix them to avoid total failure. As digital systems evolve, heterogeneous systems, in which some parts of the design execute on programmable logic while other parts run on processing elements (CPUs), are becoming more prevalent. In this work, we propose an adjustable self-healing method that is applicable to heterogeneous systems with accelerated functions and enables designers to add the self-healing feature to a design. By manipulating the software code executed on the processing element, we add to the system the ability to verify the accelerated functions on the programmable logic and to heal possible failures. This is done in a straightforward manner and without forcing a specific reliability-overhead point; the designer can select the optimum configuration for a desired reliability level. Experimental results on a large design with several accelerated functions show, as one example of a reliability-overhead point, a 42% improvement in reliability at a 27% overhead.
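
    A minimal sketch of the general pattern, with a hypothetical accelerated function, its software reference, and a check-rate parameter standing in for the adjustable reliability-overhead knob:

        import random

        # Hypothetical software reference and (possibly faulty) accelerated version.
        def saxpy_sw(a, x, y):
            return [a * xi + yi for xi, yi in zip(x, y)]

        def saxpy_hw(a, x, y):
            out = saxpy_sw(a, x, y)
            if random.random() < 0.05:          # emulate a transient fault in the fabric
                out[0] += 1.0
            return out

        def self_healing_call(a, x, y, check_rate=0.25):
            """Run the accelerator; with probability check_rate, verify against the
            CPU reference and heal (fall back to software) on mismatch."""
            result = saxpy_hw(a, x, y)
            if random.random() < check_rate:
                golden = saxpy_sw(a, x, y)
                if result != golden:
                    return golden               # healed: return the verified result
            return result

        print(self_healing_call(2.0, [1, 2, 3], [4, 5, 6]))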

  • 82.
    Salimi, M.
    et al.
    Tehran University, Tehran, Iran.
    Majd, A.
    Åbo Akademi University, Turku, Finland.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Seceleanu, Tiberiu
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Seceleanu, Cristina
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sirjani, Marjan
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Troubitsyna, E.
    Royal Institute of Technology, Stockholm, Sweden.
    Multi-objective optimization of real-time task scheduling problem for distributed environments, 2020. In: Proceedings of the 6th Conference on the Engineering of Computer Based Systems (ECBS 2019), Association for Computing Machinery, 2020, article id a13. Conference paper (Refereed)
    Abstract [en]

    Real-world applications are composed of multiple tasks that usually have intricate data dependencies. To exploit distributed processing platforms, task allocation and scheduling, that is, assigning tasks to processing units and ordering inter-processing-unit data transfers, plays a vital role. However, optimally scheduling tasks on processing units and finding an optimized network topology is an NP-complete problem, and it becomes even harder when tasks have real-time deadlines. Exploring the whole search space to find the optimal solution is not feasible in a reasonable amount of time; therefore, meta-heuristics are often used to find a near-optimal solution. We propose a multi-population evolutionary approach for near-optimal scheduling that guarantees end-to-end deadlines of tasks in distributed processing environments. We analyze two exploration scenarios: single-objective and multi-objective. The single-objective algorithm aims to minimize the number of processing units needed for all tasks, whereas the multi-objective optimization simultaneously optimizes two conflicting objectives, the total number of processing units and the end-to-end finishing time of all jobs. The potential of the proposed approach is demonstrated by experiments on a use case mapping a number of jobs from industrial automation systems, where each job consists of a number of tasks in a distributed environment.
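
    The two objectives can be sketched as a fitness evaluation over one candidate task-to-processing-unit assignment; the list-scheduling model, task data, and communication cost below are illustrative assumptions rather than the paper's exact encoding:

        # Illustrative fitness for one candidate task-to-PU assignment. The two
        # objectives are (number of processing units used, end-to-end finishing time).
        def topological_order(tasks, deps):
            order, done = [], set()
            while len(order) < len(tasks):                 # assumes an acyclic task graph
                for t in tasks:
                    if t not in done and all(p in done for p in deps.get(t, ())):
                        order.append(t)
                        done.add(t)
            return order

        def evaluate(assignment, durations, deps, comm_cost):
            finish, pu_free = {}, {}
            for task in topological_order(durations, deps):
                pu = assignment[task]
                ready = max((finish[p] + (comm_cost if assignment[p] != pu else 0.0)
                             for p in deps.get(task, ())), default=0.0)
                start = max(ready, pu_free.get(pu, 0.0))
                finish[task] = start + durations[task]
                pu_free[pu] = finish[task]
            return len(set(assignment.values())), max(finish.values())

        durations = {"t1": 2.0, "t2": 3.0, "t3": 1.0}
        deps = {"t3": ("t1", "t2")}
        print(evaluate({"t1": 0, "t2": 1, "t3": 0}, durations, deps, comm_cost=0.5))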

  • 83.
    Satka, Zenepe
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ashjaei, Seyed Mohammad Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Fotouhi, Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Mubeen, Saad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    A comprehensive systematic review of integration of time sensitive networking and 5G communication, 2023. In: Journal of Systems Architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 138, article id 102852. Article in journal (Refereed)
    Abstract [en]

    Many industrial real-time applications in various domains, e.g., automotive, industrial automation, industrial IoT, and Industry 4.0, require ultra-low end-to-end network latency, often on the order of 10 milliseconds or less. IEEE 802.1 Time-Sensitive Networking (TSN) is a set of standards that supports the required low-latency wired communication with ultra-low jitter. The flexibility of such a wired connection can be increased if it is integrated with a mobile wireless network. The fifth generation of cellular networks (5G) can support the required latency levels through its Ultra-Reliable Low-Latency Communication (URLLC) service. To fully utilize the potential of these two technologies (TSN and 5G) in industrial applications, seamless integration of the wired TSN network with the wireless 5G network is needed. In this article, we provide a comprehensive and well-structured snapshot of the existing research on TSN-5G integration. We present the planning, execution, and analysis results of the systematic review, and we identify the trends, technical characteristics, and potential gaps in the state of the art, thus highlighting future research directions in the integration of TSN and 5G communication technologies. We observe that 73% of the primary studies address time synchronization in the integration of TSN and 5G, introducing approaches with accuracies ranging from hundreds of nanoseconds to one microsecond. The majority of primary studies aim at optimizing communication latency, which is a key quality attribute in automotive and industrial automation applications today.

  • 84.
    Satka, Zenepe
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ashjaei, Seyed Mohammad Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Fotouhi, Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Mubeen, Saad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    QoS-MAN: A Novel QoS Mapping Algorithm for TSN-5G Flows, 2022. In: 2022 IEEE 28th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2022), IEEE Computer Society, 2022, p. 220-227. Conference paper (Refereed)
    Abstract [en]

    Integrating wired Ethernet networks, such as Time-Sensitive Networking (TSN), with a 5G cellular network requires a flow management technique that efficiently maps TSN traffic to 5G Quality-of-Service (QoS) flows. 3GPP Release 16 provides a set of predefined QoS characteristics, such as priority level, packet delay budget, and maximum data burst volume, which can be used for the 5G QoS flows. Within this context, mapping TSN traffic flows to 5G QoS flows in an integrated TSN-5G network is of paramount importance, as the mapping can significantly impact the end-to-end QoS of the integrated network. In this paper, we present a novel and efficient algorithm to map different TSN traffic flows to 5G QoS flows. To the best of our knowledge, this is the first QoS-aware mapping algorithm based on the application constraints used to exchange flows between the TSN and 5G network domains. We evaluate the proposed mapping algorithm on synthetic scenarios with random sets of constraints on deadline, jitter, bandwidth, and packet loss rate. The evaluation results show that the proposed mapping algorithm can fulfill over 90% of the applications' constraints.
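
    A much-simplified view of constraint-driven mapping is sketched below; the 5QI table values are illustrative placeholders (not the 3GPP-standardized characteristics), and the selection rule is not the QoS-MAN algorithm itself:

        # Illustrative constraint-driven mapping of a TSN flow to a 5G QoS flow.
        # The 5QI entries below are placeholders, not the 3GPP-standardized values.
        QOS_PROFILES = {
            "5qi_a": {"delay_budget_ms": 5,  "loss_rate": 1e-5, "bandwidth_mbps": 10},
            "5qi_b": {"delay_budget_ms": 10, "loss_rate": 1e-4, "bandwidth_mbps": 50},
            "5qi_c": {"delay_budget_ms": 50, "loss_rate": 1e-3, "bandwidth_mbps": 100},
        }

        def map_tsn_flow(deadline_ms, max_loss_rate, required_mbps):
            """Pick the least demanding profile that still satisfies all constraints."""
            feasible = [
                (p["delay_budget_ms"], name)
                for name, p in QOS_PROFILES.items()
                if p["delay_budget_ms"] <= deadline_ms
                and p["loss_rate"] <= max_loss_rate
                and p["bandwidth_mbps"] >= required_mbps
            ]
            return max(feasible)[1] if feasible else None   # loosest delay budget that fits

        print(map_tsn_flow(deadline_ms=12, max_loss_rate=1e-4, required_mbps=20))  # -> '5qi_b'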

  • 85.
    Satka, Zenepe
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Pantzar, David
    Mälardalen University.
    Magnusson, Alexander
    Mälardalen University.
    Ashjaei, Seyed Mohammad Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Fotouhi, Hossein
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Mubeen, Saad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Developing a Translation Technique for Converged TSN-5G Communication, 2022. In: IEEE International Workshop on Factory Communication Systems - Proceedings, WFCS, Institute of Electrical and Electronics Engineers Inc., 2022, p. 103-110. Conference paper (Refereed)
    Abstract [en]

    Time-Sensitive Networking (TSN) is a set of IEEE standards based on switched Ethernet that aim at meeting high-bandwidth and low-latency requirements in wired communication. TSN implementations typically do not support integration of wireless networks, which limits their applicability in many industrial applications that need both wired and wireless communication. The development of 5G and its promised Ultra-Reliable and Low-Latency Communication (URLLC), integrated with TSN, offers a promising solution to meet the bandwidth, latency, and reliability requirements of these industrial applications. To support such an integration, we propose a technique to translate traffic between the TSN and 5G communication technologies. As a proof of concept, we implement the translation technique in a well-known TSN simulator, NeSTiNg, which is based on the OMNeT++ tool. Furthermore, we evaluate the proposed technique using an automotive industrial use case.

  • 86.
    Sinaei, Sima
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Hardware acceleration for recurrent neural networks, 2020. In: Hardware Architectures for Deep Learning, Institution of Engineering and Technology, 2020, p. 27-52. Chapter in book (Other academic)
    Abstract [en]

    This chapter focuses on the LSTM model and is concerned with the design of a high-performance and energy-efficient solution for deep learning inference. The chapter is organized as follows: Section 2.1 introduces recurrent neural networks (RNNs); in this section, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) network models are discussed as special kinds of RNNs. Section 2.2 discusses inference acceleration with hardware. Section 2.3 presents a survey of various FPGA designs in the context of previous related work, after which Section 2.4 concludes the chapter.
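
    For reference, the standard LSTM cell that the chapter builds on is governed by the following gate equations (standard notation, not quoted from the chapter):

        i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)
        f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)
        o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)
        \tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
        c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
        h_t = o_t \odot \tanh(c_t)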

  • 87.
    Singh, Rajendra
    et al.
    Manipal Univ Jaipur, Dept Comp & Commun Engn, Jaipur 303007, India..
    Bohra, Manoj Kumar
    Manipal Univ Jaipur, Dept Comp & Commun Engn, Jaipur 303007, India..
    Hemrajani, Prashant
    Manipal Univ Jaipur, Dept Comp & Commun Engn, Jaipur 303007, India..
    Kalla, Anshuman
    Uka Tarsadia Univ, Chhotubhai Gopalbhai Patel Inst Technol CGPIT, Bardoli 394620, Gujarat, India..
    Bhatt, Devershi Pallavi
    Manipal Univ Jaipur, Dept Comp Applicat, Jaipur 303007, India..
    Purohit, Nitin
    Kebri Dehar Univ, Dept Comp Sci, Kebri Dehar 3060, Ethiopia..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn Univ Technol, EE-12616 Tallinn, Estonia..
    Review, Analysis, and Implementation of Path Selection Strategies for 2D NoCs, 2022. In: IEEE Access, E-ISSN 2169-3536, Vol. 10, p. 129245-129268. Article in journal (Refereed)
    Abstract [en]

    Recent advances in very-large-scale integration (VLSI) technologies have offered the capability of integrating thousands of processing elements onto a single silicon microchip. Multiprocessor systems-on-chip (MPSoCs) are the latest creation of this technology evolution. As an interconnection network, the Network-on-Chip (NoC) has emerged as a scalable and promising solution for MPSoCs to achieve high performance. In NoCs, the routing algorithm is a critical part of a router and provides a path for a packet toward its destination. Every routing algorithm should exhibit two characteristics. First, the route selection function should provide a sufficient degree of adaptiveness to avoid network congestion. Second, it should not offer stale information on the network congestion status to neighboring routers. Many researchers have investigated network congestion and proposed techniques to control or avoid it. Such congestion-avoidance-based algorithms significantly improve NoC performance, but they may incur hardware overhead for implementing a side network to collect congestion status. This paper reviews various output selection strategies used by routing algorithms to route a packet through less congested network regions. It also classifies them based on the techniques adopted to handle and propagate congestion information. Additionally, this article provides implementation and analysis details of state-of-the-art selection methods.
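
    As a concrete illustration of an output selection function, the sketch below chooses, among the output ports permitted by a minimal adaptive routing function on a 2D mesh, the neighbour reporting the most free buffer slots; this is a generic congestion-aware selection, not one of the specific strategies surveyed in the article:

        # Generic congestion-aware output selection for a 2D mesh NoC.
        def admissible_ports(src, dst):
            """Minimal adaptive routing: any port that reduces the X or Y distance."""
            (sx, sy), (dx_, dy_) = src, dst
            ports = []
            if dx_ > sx: ports.append("EAST")
            if dx_ < sx: ports.append("WEST")
            if dy_ > sy: ports.append("NORTH")
            if dy_ < sy: ports.append("SOUTH")
            return ports or ["LOCAL"]

        def select_output(src, dst, free_slots):
            """free_slots: free buffer slots reported by each neighbouring router."""
            ports = admissible_ports(src, dst)
            return max(ports, key=lambda p: free_slots.get(p, 0))  # least congested port

        free_slots = {"EAST": 1, "NORTH": 3, "WEST": 4, "SOUTH": 2}
        print(select_output((1, 1), (3, 2), free_slots))            # -> 'NORTH'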

  • 88.
    Taheri, M.
    et al.
    Tallinn University of Technology, Tallinn, Estonia.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Raik, J.
    Tallinn University of Technology, Tallinn, Estonia.
    Jenihhin, M.
    Tallinn University of Technology, Tallinn, Estonia.
    Pappalardo, S.
    Ecole Centrale de Lyon, Lyon, France.
    Jimenez, P.
    Ecole Centrale de Lyon, Lyon, France.
    Deveautour, B.
    Ecole Centrale de Lyon, Lyon, France.
    Bosio, A.
    Ecole Centrale de Lyon, Lyon, France.
    SAFFIRA: a Framework for Assessing the Reliability of Systolic-Array-Based DNN Accelerators, 2024. In: 2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS), Institute of Electrical and Electronics Engineers Inc., 2024, p. 19-24. Conference paper (Refereed)
    Abstract [en]

    The systolic array has emerged as a prominent architecture for Deep Neural Network (DNN) hardware accelerators, providing the high-throughput and low-latency performance essential for deploying DNNs across diverse applications. However, when used in safety-critical applications, reliability assessment is mandatory to guarantee the correct behavior of DNN accelerators. While fault injection stands out as a well-established, practical, and robust method for reliability assessment, it is still a very time-consuming process. This paper addresses the time-efficiency issue by introducing a novel hierarchical software-based, hardware-aware fault injection strategy tailored for systolic-array-based DNN accelerators. A system of uniform recurrence equations is used for software modeling of the systolic-array core of the DNN accelerators. The approach demonstrates a reduction of the fault injection time of up to 3× compared to state-of-the-art hybrid (software/hardware) hardware-aware fault injection frameworks, and of more than 2000× compared to RT-level fault injection frameworks, without compromising accuracy. Additionally, we propose and evaluate a new reliability metric through experimental assessment. The performance of the framework is studied on state-of-the-art DNN benchmarks.
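
    The basic software-level bit-flip experiment that such frameworks automate can be sketched as follows; the matrix sizes, injection site, and deviation metric are illustrative and far simpler than the hierarchical, systolic-array-aware strategy proposed in the paper:

        import numpy as np

        # Illustrative software fault injection: flip one bit of one weight and
        # measure how far the layer output deviates from the fault-free result.
        def flip_bit(value, bit):
            raw = np.float32(value).view(np.uint32)
            return (raw ^ np.uint32(1 << bit)).view(np.float32)

        rng = np.random.default_rng(0)
        weights = rng.standard_normal((8, 8)).astype(np.float32)
        inputs = rng.standard_normal((8, 4)).astype(np.float32)

        golden = weights @ inputs                      # fault-free reference
        faulty_w = weights.copy()
        faulty_w[2, 5] = flip_bit(faulty_w[2, 5], 30)  # flip a high-order exponent bit
        faulty = faulty_w @ inputs

        print("max abs deviation:", np.abs(faulty - golden).max())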

  • 89.
    Taheri, M.
    et al.
    Tallinn University of Technology, Tallinn, Estonia.
    Riazati, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ahmadilivani, M. H.
    Tallinn University of Technology, Tallinn, Estonia.
    Jenihhin, M.
    Tallinn University of Technology, Tallinn, Estonia.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Raik, J.
    Tallinn University of Technology, Tallinn, Estonia.
    Sjödin, M.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    DeepAxe: A Framework for Exploration of Approximation and Reliability Trade-offs in DNN Accelerators, 2023. In: Proceedings - International Symposium on Quality Electronic Design, ISQED, IEEE Computer Society, 2023. Conference paper (Refereed)
    Abstract [en]

    While the role of Deep Neural Networks (DNNs) in a wide range of safety-critical applications is expanding, emerging DNNs experience massive growth in required computation power. This raises the need to improve the reliability of DNN accelerators while reducing the computational burden on the hardware platforms, i.e., reducing energy consumption and execution time and increasing the efficiency of DNN accelerators. The trade-off between hardware performance (area, power, and delay) and the reliability of the DNN accelerator implementation therefore becomes critical and requires tools for analysis. In this paper, we propose DeepAxe, a framework for design space exploration of FPGA-based implementations of DNNs that considers the trilateral impact of applying functional approximation on accuracy, reliability, and hardware performance. The framework enables selective approximation of reliability-critical DNNs, providing a set of Pareto-optimal DNN implementation design space points for the target resource utilization requirements. The design flow starts with a pre-trained network in Keras, uses the high-level synthesis environment DeepHLS, and results in a set of Pareto-optimal design space points as a guide for the designer. The framework is demonstrated on a case study of custom and state-of-the-art DNNs and datasets.
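
    The final step of such an exploration, retaining only the Pareto-optimal design points, can be sketched as follows; the three metrics and the example numbers are hypothetical:

        # Illustrative Pareto filtering over design points described by metrics that
        # should all be minimized: (accuracy loss, vulnerability, resource usage).
        def dominates(a, b):
            return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

        def pareto_front(points):
            return [p for p in points
                    if not any(dominates(q, p) for q in points if q is not p)]

        designs = {
            "exact":        (0.00, 0.80, 1.00),
            "approx_mild":  (0.01, 0.55, 0.70),
            "approx_heavy": (0.05, 0.50, 0.40),
            "bad_config":   (0.06, 0.60, 0.80),   # dominated by approx_heavy
        }
        front = pareto_front(list(designs.values()))
        print([name for name, m in designs.items() if m in front])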

  • 90.
    Taheri, Mahdi
    et al.
    Tallinn University of Technology, Estonia.
    Ahmadilivani, Mohammad H.
    Tallinn University of Technology, Estonia.
    Jenihhin, Maksim
    Tallinn University of Technology, Estonia.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Tallinn University of Technology, Estonia.
    Raik, Jaan
    Tallinn University of Technology, Estonia.
    APPRAISER: DNN Fault Resilience Analysis Employing Approximation Errors, 2023. In: Proceedings - 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems, DDECS 2023, Institute of Electrical and Electronics Engineers Inc., 2023, p. 124-127. Conference paper (Refereed)
    Abstract [en]

    Nowadays, the extensive exploitation of Deep Neural Networks (DNNs) in safety-critical applications raises new reliability concerns. In practice, fault injection by emulation in hardware is an efficient and widely used method for studying the resilience of DNN architectures and mitigating reliability issues already at the early design stages. However, state-of-the-art methods for fault injection by emulation incur a spectrum of time-, design-, and control-complexity problems. To overcome these issues, we propose a novel resiliency assessment method called APPRAISER, which applies functional approximation for a non-conventional purpose and exploits approximate computing errors for resilience analysis. By adopting this concept in the resiliency assessment domain, APPRAISER provides thousands of times speed-up in the assessment process while keeping high accuracy of the analysis. In this paper, APPRAISER is validated by comparing it with state-of-the-art approaches for fault injection by emulation on FPGA. This demonstrates the feasibility of the idea and opens a new perspective in resiliency evaluation for DNNs.

  • 91.
    Vidimlic, Najda
    et al.
    Mälardalen University.
    Levin, Alexandra
    Mälardalen University.
    Loni, Mohammad
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Image synthesisation and data augmentation for safe object detection in aircraft auto-landing system, 2021. In: VISIGRAPP 2021 - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, SciTePress, 2021, Vol. 5, p. 123-135. Conference paper (Refereed)
    Abstract [en]

    The feasibility of deploying object detection to interpret the environment is questioned in several mission-critical applications, raising concerns about the ability of object detectors to provide reliable and safe predictions of the operational environment regardless of weather and light conditions. The lack of a comprehensive dataset, which causes class imbalance and difficulties in detecting hard examples, is one of the main reasons for accuracy loss in attaining safe object detection. Data augmentation, as an implicit regularisation technique, has been shown to significantly improve object detection by increasing both the diversity and the size of the training dataset. Despite the success of data augmentation in various computer vision tasks, applying data augmentation techniques to improve safety has not been sufficiently addressed in the literature. In this paper, we leverage a set of data augmentation techniques to improve the safety of object detection. Aircraft in-flight image data is used to evaluate the feasibility of the proposed solution in real-world safety-required scenarios. To achieve this, we first generate a training dataset by synthesising images collected from in-flight recordings. Next, we augment the generated dataset to cover real weather and lighting changes. The introduction of artificially produced distortions, also known as corruptions, has recently become an approach to enrich datasets; introducing corruptions as augmentations of weather and luminance, in combination with artificial artefacts, is done to achieve a comprehensive representation of an aircraft's operational environment. Finally, we evaluate the impact of data augmentation on the studied dataset. Faster R-CNN with ResNet-50-FPN was used as the object detector in the experiments. An AP@[IoU=.5:.95] score of 50.327% was achieved with the initial setup, while exposure to altered weather and lighting conditions yielded an 18.1% decrease. Introducing these conditions into the training set led to a 15.6% increase compared to the score obtained under exposure to the conditions.
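
    The kind of photometric corruption used to emulate altered weather and lighting can be sketched with NumPy as below; the brightness, contrast, and fog parameters are illustrative and not the exact corruption set used in the paper:

        import numpy as np

        # Illustrative photometric corruptions emulating altered lighting and fog.
        def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
            """img: float32 array in [0, 1]; shift brightness and scale contrast."""
            out = (img - 0.5) * contrast + 0.5 + brightness
            return np.clip(out, 0.0, 1.0)

        def add_fog(img, density=0.3):
            """Blend the image toward a uniform grey haze."""
            haze = np.full_like(img, 0.8)
            return np.clip((1.0 - density) * img + density * haze, 0.0, 1.0)

        rng = np.random.default_rng(0)
        frame = rng.random((64, 64, 3), dtype=np.float32)       # stand-in for an in-flight frame
        dusk = adjust_brightness_contrast(frame, brightness=-0.25, contrast=0.8)
        foggy = add_fog(frame, density=0.5)
        print(dusk.mean(), foggy.std())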

  • 92.
    Yazdanpanah, Fahimeh
    et al.
    Vali E Asr Univ, Fac Engn, Dept Comp Engn, Rafsanjan, Iran..
    AfsharMazayejani, Raheel
    Shahid Bahonar Univ, Dept Comp Engn, Fac Engn, Kerman, Iran.
    Alaei, Mohammad
    Vali E Asr Univ, Fac Engn, Dept Comp Engn, Rafsanjan, Iran..
    Rezaei, Amin
    Northwestern Univ, Evanston, IL USA..
    Daneshtalab, Masoud
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    An energy-efficient partition-based XYZ-planar routing algorithm for a wireless network-on-chip, 2019. In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 75, no 2, p. 837-861. Article in journal (Refereed)
    Abstract [en]

    In current many-core architectures, networks-on-chip (NoCs) have been efficiently utilized as communication backbones, enabling massive parallelism and a high degree of integration on a chip. In spite of the advantages of conventional NoCs, wired multi-hop links limit their performance through long delays and high power consumption, especially in large systems. To overcome these limitations, different solutions, such as wireless interconnections, have been proposed. Utilizing long-range, high-bandwidth, and low-power wireless links can help solve the problems associated with wired links. Meanwhile, the grid-like mesh is the most stable topology in conventional NoC designs, which is why most wireless network-on-chip (WNoC) architectures have been designed based on it. The goals of this article are to challenge the mesh topology and to demonstrate the efficiency of honeycomb-based WNoC architectures. We propose HoneyWiN, a hybrid wired/wireless NoC architecture with a honeycomb topology, along with a partition-based XYZ-planar routing algorithm for energy conservation. To demonstrate the advantages of the proposed architecture, an analytical comparison of HoneyWiN with a mesh-based WNoC, used as the baseline architecture, is first carried out. For a fair comparison, we also implement our partition-based routing algorithm in the form of a two-axis coordinate system in the baseline architecture. Simulation results show that HoneyWiN reduces energy consumption by about 17% while increasing throughput by 10% compared to the mesh-based WNoC. HoneyWiN is then compared with four state-of-the-art mesh-based NoC architectures; in all evaluations, it provides higher performance in terms of delay, throughput, and energy consumption. Overall, the results indicate that HoneyWiN is very effective in improving throughput, increasing speed, and reducing energy consumption.
