DeepMaker: A multi-objective optimization framework for deep neural networks in embedded systems
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.ORCID iD: 0000-0002-9704-7117
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
Shiraz University of Technology, Shiraz, Iran.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
2020 (English). In: Microprocessors and Microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 73, article id 102989. Article in journal (Refereed). Published.
Abstract [en]

Deep Neural Networks (DNNs) are compute-intensive learning models with growing applicability in a wide range of domains. Due to their computational complexity, DNNs benefit from implementations that use custom hardware accelerators to meet performance and response-time constraints as well as classification accuracy requirements. In this paper, we propose the DeepMaker framework, which aims to automatically design a set of highly robust DNN architectures for embedded devices, the processing units closest to the sensors. DeepMaker explores and prunes the design space to find improved neural architectures. Our proposed framework uses a multi-objective evolutionary approach that exploits a pruned design space inspired by a dense architecture. DeepMaker treats accuracy and network size as two objectives to build a highly optimized network that fits limited computational resource budgets while delivering an acceptable level of accuracy. Compared with the best result on the CIFAR-10 dataset, a network generated by DeepMaker achieves up to a 26.4x compression rate while losing only 4% accuracy. In addition, DeepMaker maps the generated CNN onto programmable commodity devices, including an ARM processor, a high-performance CPU, a GPU, and an FPGA.
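The abstract describes a multi-objective search that trades classification accuracy against network size. A minimal sketch of the Pareto-dominance test at the core of such evolutionary NAS approaches — an illustration only, not DeepMaker's actual implementation, and the candidate (accuracy, size) tuples are invented:

```python
def dominates(a, b):
    # a and b are (accuracy, size) pairs; maximize accuracy, minimize size.
    # a dominates b if it is at least as good in both objectives
    # and strictly better in at least one.
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def pareto_front(population):
    # Keep every candidate that no other candidate dominates.
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

# Toy candidate architectures: (accuracy, parameter count in millions)
candidates = [(0.92, 5.0), (0.90, 1.2), (0.88, 0.4), (0.85, 0.9), (0.91, 4.0)]
front = pareto_front(candidates)
# (0.85, 0.9) is dominated by (0.88, 0.4) and drops out of the front.
```

A real multi-objective NAS would repeatedly mutate and recombine candidates, train or estimate each one, and reselect the non-dominated set; the dominance test above is the step that lets the search keep a whole accuracy/size trade-off curve instead of a single winner.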

Place, publisher, year, edition, pages
Elsevier B.V., 2020. Vol. 73, article id 102989
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:mdh:diva-46792
DOI: 10.1016/j.micpro.2020.102989
ISI: 000520940000032
Scopus ID: 2-s2.0-85077516447
OAI: oai:DiVA.org:mdh-46792
DiVA, id: diva2:1388110
Available from: 2020-01-23. Created: 2020-01-23. Last updated: 2022-11-25. Bibliographically approved.
In thesis
1. DeepMaker: Customizing the Architecture of Convolutional Neural Networks for Resource-Constrained Platforms
2020 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Convolutional Neural Networks (CNNs) suffer from energy-hungry implementations because they require huge amounts of computation and consume significant memory. This problem is magnified by the proliferation of CNNs on resource-constrained platforms in, e.g., embedded systems. In this thesis, we focus on decreasing the computational cost of CNNs to make them suitable for resource-constrained platforms. The thesis proposes two distinct methods to tackle these challenges: optimizing the CNN architecture while considering both network accuracy and network complexity, and an optimized ternary neural network that compensates for the accuracy loss of network quantization methods. We evaluated the impact of our solutions on Commercial-Off-The-Shelf (COTS) platforms, where the results show considerable improvement in network accuracy and energy efficiency.

Abstract [sv]

Convolutional Neural Networks (CNNs) suffer from energy-hungry implementations because they require enormous computational capacity and have significant memory consumption. This problem becomes more pronounced as ever more CNNs are deployed on resource-constrained platforms in embedded computer systems. In this thesis, we focus on reducing the resource consumption of CNNs, in terms of required computation and memory, to make them suitable for resource-constrained platforms. We propose two methods to address these challenges: optimizing the CNN architecture by balancing network accuracy against network complexity, and an optimized ternary neural network that compensates for the accuracy losses that can arise with network quantization methods. We evaluated the effects of our solutions on commercial-off-the-shelf (COTS) platforms, where the results show significant improvements in network accuracy and energy efficiency.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2020
Series
Mälardalen University Press Licentiate Theses, ISSN 1651-9256 ; 299
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-52113
ISBN: 978-91-7485-490-9
Presentation
2020-12-04, U2-024 (+ Online/Zoom), Mälardalens högskola, Västerås, 11:30 (English)
Opponent
Supervisors
Projects
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
DPAC - Dependable Platforms for Autonomous systems and Control
FAST-ARTS: Fast and Sustainable Analysis Techniques for Advanced Real-Time Systems
Available from: 2020-11-10. Created: 2020-10-29. Last updated: 2020-11-13. Bibliographically approved.
2. Efficient Design of Scalable Deep Neural Networks for Resource-Constrained Edge Devices
2022 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Deep Neural Networks (DNNs) are increasingly being processed on resource-constrained edge nodes (computer nodes used in, e.g., cyber-physical systems or at the edge of computational clouds) due to efficiency, connectivity, and privacy concerns. This thesis investigates and presents new techniques for designing and deploying DNNs on resource-constrained edge nodes. We have identified two major bottlenecks that hinder the proliferation of DNNs on edge nodes: (i) designing DNNs that consume few resources in terms of energy, latency, and memory footprint demands significant computation; and (ii) conserving resources further by quantizing a DNN's numerical calculations causes remarkable accuracy degradation.

To address (i), we present novel methods for cost-efficient Neural Architecture Search (NAS) that automate the design of DNNs meeting multifaceted goals such as accuracy and hardware performance. To address (ii), we extend our NAS approach to handle the quantization of numerical calculations using only the numbers -1, 0, and 1 (so-called ternary DNNs), which achieves higher accuracy. Our experimental evaluation shows that the proposed NAS approach provides a 5.25x reduction in design time and up to a 44.4x reduction in network size compared to state-of-the-art methods. In addition, the proposed quantization approach delivers 2.64% higher accuracy and 2.8x memory savings compared to full-precision counterparts with the same bit-width resolution. These benefits hold across a wide range of commercial-off-the-shelf edge nodes, showing that this thesis successfully enables seamless deployment of DNNs on resource-constrained edge nodes.
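Ternary DNNs, as described above, restrict numerical values to -1, 0, and 1. A minimal sketch of one common threshold-based ternarization scheme — an illustration using the 0.7·mean|w| heuristic known from Ternary Weight Networks, which may differ from the thesis's exact quantization method:

```python
def ternarize(weights, delta_ratio=0.7):
    # Weights with magnitude below a threshold delta become 0; the rest
    # keep only their sign (-1 or +1). delta = delta_ratio * mean(|w|)
    # is a common heuristic, assumed here for illustration.
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_ratio * mean_abs
    return [0 if abs(w) < delta else (1 if w > 0 else -1) for w in weights]

weights = [0.8, -0.05, 0.3, -0.6, 0.02, -0.4]
print(ternarize(weights))  # → [1, 0, 1, -1, 0, -1]
```

Replacing full-precision weights with values from {-1, 0, 1} lets multiplications degenerate into sign flips and skips, which is where the memory and energy savings on edge hardware come from; the accuracy cost of this coarse quantization is what the thesis's extended NAS approach aims to recover.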

Place, publisher, year, edition, pages
Västerås: Mälardalens universitet, 2022
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 363
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-59946
ISBN: 978-91-7485-563-0
Public defence
2022-10-13, Delta och online, Mälardalens universitet, Västerås, 13:30 (English)
Opponent
Supervisors
Projects
AutoDeep: Automatic Design of Safe, High-Performance and Compact Deep Learning Models for Autonomous Vehicles
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2022-09-15. Created: 2022-09-14. Last updated: 2022-11-08. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Loni, Mohammad; Sinaei, Sima; Daneshtalab, Masoud; Sjödin, Mikael
