TOT-Net: An Endeavor Toward Optimizing Ternary Neural Networks
University of Tehran, Tehran, Iran.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ES (Embedded Systems). ORCID iD: 0000-0002-9704-7117
University of Tehran, Tehran, Iran.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
2019 (English). In: 22nd Euromicro Conference on Digital System Design (DSD 2019), 2019, p. 305-312, article id 8875067. Conference paper, Published paper (Refereed)
Abstract [en]

High computational demand and a large memory footprint are the major implementation challenges of Convolutional Neural Networks (CNNs), especially on low-power, resource-limited embedded devices. Many binarized neural networks have recently been proposed to address these issues. Although they significantly reduce computation and memory footprint, they suffer from accuracy loss, especially on large datasets. In this paper, we propose TOT-Net, a ternarized neural network that uses values in {-1, 0, 1} for both weights and activations and simultaneously achieves higher accuracy and a lower computational load. First, TOT-Net introduces a simple bitwise logic for convolution computations to reduce the cost of multiply operations. Second, because selecting a proper activation function and learning rate strongly influences accuracy but is difficult, we propose a novel piece-wise activation function and optimized learning rates for different datasets; our findings reveal that 0.01 is a preferable learning rate for the studied datasets. Third, using an evolutionary optimization approach, we found novel piece-wise activation functions customized for TOT-Net. According to the experimental results, TOT-Net achieves 2.15%, 8.77%, and 5.7/5.52% better accuracy than XNOR-Net on CIFAR-10, CIFAR-100, and ImageNet (top-5/top-1), respectively.
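
The abstract does not spell out TOT-Net's bitwise logic, so the following is only a generic sketch of how a dot product over {-1, 0, 1} values can be computed without multiplications. It assumes a symmetric threshold for ternarization and a two-bitmask encoding (one mask for +1 positions, one for -1 positions); the function names, the threshold value, and the sample inputs are hypothetical and are not taken from the paper.

```python
def ternarize(values, threshold=0.05):
    """Map each real value to -1, 0, or +1 using a symmetric threshold (assumed)."""
    return [1 if v > threshold else -1 if v < -threshold else 0 for v in values]


def pack(ternary):
    """Encode a ternary vector as two bitmasks: positions of +1 and positions of -1."""
    plus = minus = 0
    for i, t in enumerate(ternary):
        if t == 1:
            plus |= 1 << i
        elif t == -1:
            minus |= 1 << i
    return plus, minus


def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def ternary_dot(a, b):
    """Dot product of two packed ternary vectors using only AND and popcount."""
    a_plus, a_minus = a
    b_plus, b_minus = b
    # Matching signs contribute +1, opposite signs contribute -1, zeros contribute 0.
    same = popcount(a_plus & b_plus) + popcount(a_minus & b_minus)
    opposite = popcount(a_plus & b_minus) + popcount(a_minus & b_plus)
    return same - opposite


# Hypothetical sample values, chosen only to exercise all three ternary levels.
weights = ternarize([0.9, -0.02, -0.7, 0.3])
activations = ternarize([0.5, 0.8, 0.6, -0.01])
result = ternary_dot(pack(weights), pack(activations))
reference = sum(w * x for w, x in zip(weights, activations))
```

Here `result` matches the multiply-based `reference`, which illustrates why ternary values allow multiplications to be replaced with bitwise operations; a convolution would apply the same idea to each filter window.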

Place, publisher, year, edition, pages
2019. p. 305-312, article id 8875067
Keywords [en]
convolutional neural networks, ternary neural network, activation function, optimization
National Category
Engineering and Technology; Computer Systems
Identifiers
URN: urn:nbn:se:mdh:diva-45042
DOI: 10.1109/DSD.2019.00052
ISI: 000722275400043
Scopus ID: 2-s2.0-85074915397
OAI: oai:DiVA.org:mdh-45042
DiVA, id: diva2:1345197
Conference
22nd Euromicro Conference on Digital System Design DSD 2019, 28 Aug 2019, Chalkidiki, Greece
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
Available from: 2019-08-23 Created: 2019-08-23 Last updated: 2022-11-08. Bibliographically approved
In thesis
1. DeepMaker: Customizing the Architecture of Convolutional Neural Networks for Resource-Constrained Platforms
2020 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Convolutional Neural Networks (CNNs) are energy-hungry to implement because they require a huge number of computations and consume significant memory. This problem becomes more prominent as CNNs proliferate on resource-constrained platforms such as embedded systems. In this thesis, we focus on decreasing the computational cost of CNNs to make them suitable for resource-constrained platforms. The thesis proposes two distinct methods to tackle these challenges: optimizing the CNN architecture with respect to both network accuracy and network complexity, and an optimized ternary neural network that compensates for the accuracy loss of network quantization methods. We evaluated the impact of our solutions on Commercial-Off-The-Shelf (COTS) platforms, where the results show considerable improvement in network accuracy and energy efficiency.

Abstract [sv] (English translation)

Convolutional Neural Networks (CNNs) suffer from energy-hungry implementations because they require enormous computational capacity and consume a significant amount of memory. This problem will become more pronounced as more and more CNNs are deployed on resource-constrained platforms in embedded computer systems. In this thesis, we focus on reducing the resource usage of CNNs, in terms of required computation and required memory, to make them suitable for resource-constrained platforms. We propose two methods to address the challenges: optimizing the CNN architecture by balancing network accuracy against network complexity, and an optimized ternary neural network to compensate for the accuracy loss that can arise with network quantization methods. We evaluated the effects of our solutions on commercial off-the-shelf (COTS) platforms, where the results show significant improvements in network accuracy and energy efficiency.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2020
Series
Mälardalen University Press Licentiate Theses, ISSN 1651-9256 ; 299
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-52113 (URN)
978-91-7485-490-9 (ISBN)
Presentation
2020-12-04, U2-024 (+ Online/Zoom), Mälardalens högskola, Västerås, 11:30 (English)
Projects
DeepMaker: Deep Learning Accelerator on Commercial Programmable Devices
DPAC - Dependable Platforms for Autonomous systems and Control
FAST-ARTS: Fast and Sustainable Analysis Techniques for Advanced Real-Time Systems
Available from: 2020-11-10 Created: 2020-10-29 Last updated: 2020-11-13. Bibliographically approved
2. Efficient Design of Scalable Deep Neural Networks for Resource-Constrained Edge Devices
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Deep Neural Networks (DNNs) are increasingly being processed on resource-constrained edge nodes (computer nodes used in, e.g., cyber-physical systems or at the edge of computational clouds) due to efficiency, connectivity, and privacy concerns. This thesis investigates and presents new techniques to design and deploy DNNs for resource-constrained edge nodes. We have identified two major bottlenecks that hinder the proliferation of DNNs on edge nodes: (i) the significant computational demand of designing DNNs that consume few resources in terms of energy, latency, and memory footprint; and (ii) the considerable accuracy degradation caused by quantizing the numerical calculations of a DNN to conserve resources further.

To address (i), we present novel methods for cost-efficient Neural Architecture Search (NAS) that automate the design of DNNs meeting multifaceted goals such as accuracy and hardware performance. To address (ii), we extend our NAS approach to handle the quantization of numerical calculations using only the numbers -1, 0, and 1 (so-called ternary DNNs), which achieves higher accuracy. Our experimental evaluation shows that the proposed NAS approach can provide a 5.25x reduction in design time and up to a 44.4x reduction in network size compared to state-of-the-art methods. In addition, the proposed quantization approach delivers 2.64% higher accuracy and 2.8x memory saving compared to full-precision counterparts with the same bit-width resolution. These benefits are attained over a wide range of commercial-off-the-shelf edge nodes, showing that this thesis successfully provides seamless deployment of DNNs on resource-constrained edge nodes.
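
The abstract does not describe how the NAS approach weighs its goals against each other. As one common way to reason about "multifaceted goals such as accuracy and hardware performance", the sketch below filters candidate networks down to a Pareto front over two objectives. The choice of objectives (accuracy and model size), the function names, and all candidate values are hypothetical illustrations, not results or methods from the thesis.

```python
def dominates(a, b):
    """Return True if candidate a is no worse than b on both goals and better on one.

    Each candidate is an (accuracy, size) pair; higher accuracy and smaller
    size are treated as better (an assumption made for this sketch).
    """
    acc_a, size_a = a
    acc_b, size_b = b
    no_worse = acc_a >= acc_b and size_a <= size_b
    strictly_better = acc_a > acc_b or size_a < size_b
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up (accuracy, size in MB) pairs, used purely for illustration.
candidates = [(0.91, 4.0), (0.89, 1.5), (0.88, 2.0), (0.93, 9.0), (0.90, 4.5)]
front = pareto_front(candidates)
```

With these made-up values, `front` keeps the three candidates that represent distinct trade-offs and drops the two that are both less accurate and larger than some alternative.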

Place, publisher, year, edition, pages
Västerås: Mälardalens universitet, 2022
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 363
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-59946 (URN)
978-91-7485-563-0 (ISBN)
Public defence
2022-10-13, Delta and online, Mälardalens universitet, Västerås, 13:30 (English)
Projects
AutoDeep: Automatic Design of Safe, High-Performance and Compact Deep Learning Models for Autonomous Vehicles
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2022-09-15 Created: 2022-09-14 Last updated: 2022-11-08. Bibliographically approved

Authority records

Loni, Mohammad; Daneshtalab, Masoud; Sjödin, Mikael
