https://www.mdu.se/

Daneshtalab, Masoud
Publications (10 of 107)
Taheri, M., Cherezova, N., Nazari, S., Azarpeyvand, A., Ghasempouri, T., Daneshtalab, M., . . . Jenihhin, M. (2025). AdAM: Adaptive Approximate Multiplier for Fault Tolerance in DNN Accelerators. IEEE transactions on device and materials reliability, 25(1), 66-75
2025 (English). In: IEEE transactions on device and materials reliability, ISSN 1530-4388, E-ISSN 1558-2574, Vol. 25, no 1, p. 66-75. Article in journal (Refereed). Published
Abstract [en]

Deep Neural Network (DNN) hardware accelerators are essential in a spectrum of safety-critical edge-AI applications with stringent reliability, energy efficiency, and latency requirements. Multiplication is the most resource-hungry operation in the neural network's processing elements. This paper proposes a scalable adaptive fault-tolerant approximate multiplier (AdAM) tailored for ASIC-based DNN accelerators at the algorithm and circuit levels. AdAM employs an adaptive adder that relies on an unconventional use of input Leading One Detector (LOD) values for fault detection by optimizing unutilized adder resources. A gate-level optimized LOD design and a hybrid adder design are also proposed as a part of the adaptive multiplier to improve the hardware performance. The proposed architecture uses a lightweight fault mitigation technique that sets the detected faulty bits to zero. The hardware resource utilization and the DNN accelerator's reliability metrics are used to compare the proposed solution against the Triple Modular Redundancy (TMR) in multiplication, unprotected exact multiplication, and unprotected approximate multiplication. It is demonstrated that the proposed architecture enables a multiplication with a reliability level close to the multipliers protected by TMR while at the same time utilizing 2.74x less area and with 39.06% less power-delay product compared to the exact multiplier. Moreover, it has similar area, delay, and power consumption parameters compared to the state-of-the-art approximate multipliers with similar accuracy while providing fault detection and mitigation capability.
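
The abstract names two generic building blocks: a Leading One Detector (LOD) and Triple Modular Redundancy (TMR), the protected baseline the proposal is compared against. The Python sketch below illustrates only these textbook ideas in software. It is not the AdAM circuit, and the function names `leading_one_position` and `tmr_vote` are hypothetical, chosen for this illustration.

```python
def leading_one_position(x: int) -> int:
    """Return the bit index of the most significant 1 in x, or -1 if x == 0."""
    if x < 0:
        raise ValueError("expects a non-negative integer")
    return x.bit_length() - 1


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies of a result."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    product = 13 * 11              # exact result, 143
    faulty = product ^ (1 << 5)    # one redundant copy with bit 5 flipped
    print(leading_one_position(13))             # 13 = 0b1101, prints 3
    print(tmr_vote(product, faulty, product))   # prints 143, the fault is outvoted
```

The voter masks any single faulty copy at the cost of computing the product three times, which is the overhead an approach of this kind aims to avoid.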

Place, publisher, year, edition, pages
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2025
Keywords
Accuracy, Fault tolerant systems, Adders, Hardware, Artificial neural networks, Resource management, Reliability engineering, Integrated circuit reliability, Fault detection, Prevention and mitigation, Deep neural networks, approximate computing, circuit design, reliability, DNN accelerator
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:mdh:diva-70554 (URN); 10.1109/TDMR.2024.3523386 (DOI); 001449689000004 (); 2-s2.0-105001086760 (Scopus ID)
Available from: 2025-03-31. Created: 2025-03-31. Last updated: 2025-04-09. Bibliographically approved
Ahmadilivani, M. H., Taheri, M., Raik, J., Daneshtalab, M. & Jenihhin, M. (2024). A Systematic Literature Review on Hardware Reliability Assessment Methods for Deep Neural Networks. ACM Computing Surveys, 56(6), Article ID 141.
2024 (English). In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 56, no 6, article id 141. Article in journal (Refereed). Published
Abstract [en]

Artificial Intelligence (AI) and, in particular, Machine Learning (ML), have emerged to be utilized in various applications due to their capability to learn how to solve complex problems. Over the past decade, rapid advances in ML have presented Deep Neural Networks (DNNs) consisting of a large number of neurons and layers. DNN Hardware Accelerators (DHAs) are leveraged to deploy DNNs in the target applications. Safety-critical applications, where hardware faults/errors would result in catastrophic consequences, also benefit from DHAs. Therefore, the reliability of DNNs is an essential subject of research. In recent years, several studies have been published accordingly to assess the reliability of DNNs. In this regard, various reliability assessment methods have been proposed on a variety of platforms and applications. Hence, there is a need to summarize the state-of-the-art to identify the gaps in the study of the reliability of DNNs. In this work, we conduct a Systematic Literature Review (SLR) on the reliability assessment methods of DNNs to collect relevant research works as much as possible, present a categorization of them, and address the open challenges. Through this SLR, three kinds of methods for reliability assessment of DNNs are identified, including Fault Injection (FI), Analytical, and Hybrid methods. Since the majority of works assess the DNN reliability by FI, we characterize different approaches and platforms of the FI method comprehensively. Moreover, Analytical and Hybrid methods are propounded. Thus, different reliability assessment methods for DNNs have been elaborated on their conducted DNN platforms and reliability evaluation metrics. Finally, we highlight the advantages and disadvantages of the identified methods and address the open challenges in the research area. 
We have concluded that Analytical and Hybrid methods are light-weight yet sufficiently accurate and have the potential to be extended in future research and to be utilized in establishing novel DNN reliability assessment frameworks.

Place, publisher, year, edition, pages
ASSOC COMPUTING MACHINERY, 2024
Keywords
Reliability assessment, deep neural networks, DNN hardware accelerator, fault injection
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:mdh:diva-66411 (URN); 10.1145/3638242 (DOI); 001208566200007 (); 2-s2.0-85188964919 (Scopus ID)
Available from: 2024-04-10. Created: 2024-04-10. Last updated: 2024-05-15. Bibliographically approved
Taheri, M., Cherezova, N., Nazari, S., Rafiq, A., Azarpeyvand, A., Ghasempouri, T., . . . Jenihhin, M. (2024). AdAM: Adaptive Fault-Tolerant Approximate Multiplier for Edge DNN Accelerators. In: Proceedings of the European Test Workshop. Paper presented at 29th IEEE European Test Symposium, ETS 2024, The Hague, Netherlands, 20-24 May 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings of the European Test Workshop, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Multiplication is the most resource-hungry operation in the neural network's processing elements. In this paper, we propose an architecture of a novel adaptive fault-tolerant approximate multiplier tailored for ASIC-based DNN accelerators. AdAM employs an adaptive adder relying on an unconventional use of the leading one position value of the inputs for fault detection through the optimization of unutilized adder resources. The proposed architecture uses a lightweight fault mitigation technique that sets the detected faulty bits to zero. The hardware resource utilization and the DNN accelerator's reliability metrics are used to compare the proposed solution against the triple modular redundancy (TMR) in multiplication, unprotected exact multiplication, and unprotected approximate multiplication. It is demonstrated that the proposed architecture enables a multiplication with a reliability level close to the multipliers protected by TMR utilizing 63.54% less area and having 39.06% lower power-delay product compared to the exact multiplier.
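
The area saving is quoted here as a percentage, while the journal version of this work quotes "2.74x less area". Assuming both figures refer to the same comparison, the two forms can be related by simple arithmetic. The helper name `reduction_factor` below is hypothetical and serves only as a quick consistency check.

```python
def reduction_factor(percent_less: float) -> float:
    """Convert 'p% less' into an 'N times smaller' factor: N = 1 / (1 - p/100)."""
    if not 0 <= percent_less < 100:
        raise ValueError("percent_less must be in [0, 100)")
    return 1.0 / (1.0 - percent_less / 100.0)


if __name__ == "__main__":
    # 63.54% less area corresponds to roughly a 2.74x smaller area.
    print(round(reduction_factor(63.54), 2))
```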

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
approximate computing, circuits design, deep neural networks, reliability, resiliency assessment, Convolution, Fault detection, Fault tolerance, Fault tolerant computer systems, Network architecture, Redundancy, Timing circuits, Adaptive fault tolerant, Circuit designs, Faults detection, Network-processing elements, Neural-network processing, Position value, Proposed architectures, Triple modular redundancy, Adders
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-68072 (URN); 10.1109/ETS61313.2024.10567161 (DOI); 001260970400008 (); 2-s2.0-85197518684 (Scopus ID); 9798350349320 (ISBN)
Conference
29th IEEE European Test Symposium, ETS 2024, The Hague, Netherlands, 20-24 May 2024
Available from: 2024-07-17. Created: 2024-07-17. Last updated: 2024-08-28. Bibliographically approved
Lindén, J., Ermedahl, A., Salomonsson, H., Daneshtalab, M., Forsberg, B. & Carbone, P. (2024). Autonomous Realization of Safety- and Time-Critical Embedded Artificial Intelligence. In: 2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE. Paper presented at 27th Design, Automation and Test in Europe Conference and Exhibition (DATE), MAR 25-27, 2024, Valencia, SPAIN. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

There is an evident need to complement embedded critical control logic with AI inference, but today's AI-capable hardware, software, and processes are primarily targeted towards the needs of cloud-centric actors. Telecom and defense airspace industries, which make heavy use of specialized hardware, face the challenge of manually hand-tuning AI workloads and hardware, presenting an unprecedented cost and complexity due to the diversity and sheer number of deployed instances. Furthermore, embedded AI functionality must not adversely affect real-time and safety requirements of the critical business logic. To address this, end-to-end AI pipelines for critical platforms are needed to automate the adaption of networks to fit into resource-constrained devices under critical and real-time constraints, while remaining interoperable with de-facto standard AI tools and frameworks used in the cloud. We present two industrial applications where such solutions are needed to bring AI to critical and resource-constrained hardware, and a generalized end-to-end AI pipeline that addresses these needs. Crucial steps to realize it are taken in the industry-academia collaborative FASTER-AI project.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
Design Automation and Test in Europe Conference and Exhibition, ISSN 1530-1591
Keywords
machine learning, embedded systems
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-68988 (URN); 10.23919/DATE58400.2024.10546824 (DOI); 001253778900307 (); 979-8-3503-4860-6 (ISBN)
Conference
27th Design, Automation and Test in Europe Conference and Exhibition (DATE), MAR 25-27, 2024, Valencia, SPAIN
Available from: 2024-11-13. Created: 2024-11-13. Last updated: 2025-02-12. Bibliographically approved
Houtan, B., Ashjaei, S. M., Daneshtalab, M., Sjödin, M. & Mubeen, S. (2024). Bandwidth Reservation Analysis for Schedulability of AVB Traffic in TSN. In: Proceedings of the IEEE International Conference Industrial Technology. Paper presented at 25th IEEE International Conference on Industrial Technology, Bristol, England, 25-27th March, 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings of the IEEE International Conference Industrial Technology, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present a bandwidth reservation analysis for Audio-Video Bridging (AVB) traffic in the Time-Sensitive Networking (TSN) standards. The proposed analysis is based on the existing worst-case response-time analysis and can be used to calculate the minimum required bandwidth for guaranteeing the schedulability of messages in AVB classes. The proposed analysis allocates per-link bandwidth to AVB traffic that is sufficient to ensure its schedulability when a combination of the Credit-Based Shaper and Time-Aware Shaper mechanisms are used. We evaluate the proposed analysis using an automotive industrial use case. We evaluate the schedulability of AVB traffic by comparing the proposed analysis with the utilization-based bandwidth reservation as recommended by the TSN standards.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Audio videos, Automotives, Bandwidth reservation, Industrial use case, Link bandwidth, Response-time analysis, Schedulability, Worst case response time, Bandwidth
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-67698 (URN); 10.1109/ICIT58233.2024.10540711 (DOI); 2-s2.0-85195787047 (Scopus ID); 9798350340266 (ISBN)
Conference
25th IEEE International Conference on Industrial Technology, Bristol, England, 25-27th March, 2024
Available from: 2024-06-20. Created: 2024-06-20. Last updated: 2024-06-20. Bibliographically approved
Berisa, A., Mubeen, S., Daneshtalab, M., Ashjaei, S. M., Sjödin, M., Kraljusic, B. & Zahirovic, N. (2024). Bridging the Gap: An Interface Architecture for Integrating CAN and TSN Networks.
2024 (English). Report (Other academic)
Series
MRTC Report, Mälardalen Real-Time Research Centre ; 351
National Category
Computer Systems
Identifiers
urn:nbn:se:mdh:diva-68558 (URN); MDH-MRTC-351/2024-1-SE (ISRN)
Available from: 2024-10-02. Created: 2024-10-02. Last updated: 2024-10-02. Bibliographically approved
Zoljodi, A., Abadijou, S., Alibeigi, M. & Daneshtalab, M. (2024). Contrastive Learning for Lane Detection via cross-similarity. Pattern Recognition Letters, 185, 175-183
2024 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 185, p. 175-183. Article in journal (Refereed). Published
Abstract [en]

Detecting lane markings in road scenes poses a significant challenge due to their intricate nature, which is susceptible to unfavorable conditions. While lane markings have strong shape priors, their visibility is easily compromised by varying lighting conditions, adverse weather, occlusions by other vehicles or pedestrians, road plane changes, and fading of colors over time. The detection process is further complicated by the presence of several lane shapes and natural variations, necessitating large amounts of high-quality and diverse data to train a robust lane detection model capable of handling various real-world scenarios. In this paper, we present a novel self-supervised learning method termed Contrastive Learning for Lane Detection via Cross-Similarity (CLLD) to enhance the resilience and effectiveness of lane detection models in real-world scenarios, particularly when the visibility of lane markings is compromised. CLLD introduces a novel contrastive learning (CL) method that assesses the similarity of local features within the global context of the input image. It uses the surrounding information to predict lane markings. This is achieved by integrating local feature contrastive learning with our newly proposed operation, dubbed cross-similarity. The local feature CL concentrates on extracting features from small patches, a necessity for accurately localizing lane segments. Meanwhile, cross-similarity captures global features, enabling the detection of obscured lane segments based on their surroundings. We enhance cross-similarity by randomly masking portions of input images in the process of augmentation. Extensive experiments on TuSimple and CuLane benchmark datasets demonstrate that CLLD consistently outperforms state-of-the-art contrastive learning methods, particularly in visibility-impairing conditions like shadows, while it also delivers comparable results under normal conditions.
When compared to supervised learning, CLLD still excels in challenging scenarios such as shadows and crowded scenes, which are common in real-world driving.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Contrastive learning, Convolutional neural networks, Lane detection, Adversarial machine learning, Road and street markings, Self-supervised learning, Semi-supervised learning, Condition, Convolutional neural network, Detection models, Input image, Lane markings, Learning methods, Local feature, Real-world scenario, Shape priors
National Category
Infrastructure Engineering
Identifiers
urn:nbn:se:mdh:diva-68260 (URN); 10.1016/j.patrec.2024.08.007 (DOI); 001301208800001 (); 2-s2.0-85201504705 (Scopus ID)
Available from: 2024-08-28. Created: 2024-08-28. Last updated: 2024-09-11. Bibliographically approved
Ahmadilivani, M. H., Mousavi, H., Raik, J., Daneshtalab, M. & Jenihhin, M. (2024). Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning. In: Proceedings - 2024 IEEE 30th International Symposium on On-line Testing and Robust System Design, IOLTS 2024. Paper presented at 2024 IEEE 30th International Symposium on On-line Testing and Robust System Design, IOLTS 2024, Rennes, France, 3-5 July, 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE 30th International Symposium on On-line Testing and Robust System Design, IOLTS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Convolutional Neural Networks (CNNs) have become integral in safety-critical applications, thus raising concerns about their fault tolerance. Conventional hardware-dependent fault tolerance methods, such as Triple Modular Redundancy (TMR), are computationally expensive, imposing a remarkable overhead on CNNs. Whereas fault tolerance techniques can be applied either at the hardware level or at the model levels, the latter provides more flexibility without sacrificing generality. This paper introduces a model-level hardening approach for CNNs by integrating error correction directly into the neural networks. The approach is hardware-agnostic and does not require any changes to the underlying accelerator device. Analyzing the vulnerability of parameters enables the duplication of selective filters/neurons so that their output channels are effectively corrected with an efficient and robust correction layer. The proposed method demonstrates fault resilience nearly equivalent to TMR-based correction but with significantly reduced overhead. Nevertheless, there exists an inherent overhead to the baseline CNNs. To tackle this issue, a cost-effective parameter vulnerability based pruning technique is proposed that outperforms the conventional pruning method, yielding smaller networks with a negligible accuracy loss. Remarkably, the hardened pruned CNNs perform up to 24% faster than the hardened un-pruned ones.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Error correction, Fault tolerance, Hardening, Multilayer neural networks, Redundancy, Convolutional neural network, Cost effective, Errors correction, Fault resilience, Fault tolerance techniques, Neural-networks, Output channels, Safety critical applications, Selective filters, Triple modular redundancy, Convolutional neural networks
National Category
Computer Engineering
Identifiers
urn:nbn:se:mdh:diva-68261 (URN); 10.1109/IOLTS60994.2024.10616072 (DOI); 001293143000020 (); 2-s2.0-85201385252 (Scopus ID); 9798350370553 (ISBN)
Conference
2024 IEEE 30th International Symposium on On-line Testing and Robust System Design, IOLTS 2024, Rennes, France, 3-5 July, 2024
Available from: 2024-08-28. Created: 2024-08-28. Last updated: 2024-10-16. Bibliographically approved
Sharifi, A. A., Zoljodi, A. & Daneshtalab, M. (2024). DAT: Deep Learning-Based Acceleration-Aware Trajectory Forecasting. JOURNAL OF IMAGING, 10(12), Article ID 321.
2024 (English). In: JOURNAL OF IMAGING, ISSN 2313-433X, Vol. 10, no 12, article id 321. Article in journal (Refereed). Published
Abstract [en]

As the demand for autonomous driving (AD) systems has increased, the enhancement of their safety has become critically important. A fundamental capability of AD systems is object detection and trajectory forecasting of vehicles and pedestrians around the ego-vehicle, which is essential for preventing potential collisions. This study introduces the Deep learning-based Acceleration-aware Trajectory forecasting (DAT) model, a deep learning-based approach for object detection and trajectory forecasting, utilizing raw sensor measurements. DAT is an end-to-end model that processes sequential sensor data to detect objects and forecasts their future trajectories at each time step. The core innovation of DAT lies in its novel forecasting module, which leverages acceleration data to enhance trajectory forecasting, leading to the consideration of a variety of agent motion models. We propose a robust and innovative method for estimating ground-truth acceleration for objects, along with an object detector that predicts acceleration attributes for each detected object and a novel method for trajectory forecasting. DAT is trained and evaluated on the NuScenes dataset, demonstrating its empirical effectiveness through extensive experiments. The results indicate that DAT significantly surpasses state-of-the-art methods, particularly in enhancing forecasting accuracy for objects exhibiting both linear and nonlinear motion patterns, achieving up to a 2x improvement. This advancement highlights the critical role of incorporating acceleration data into predictive models, representing a substantial step forward in the development of safer autonomous driving systems.

Place, publisher, year, edition, pages
MDPI, 2024
Keywords
end-to-end trajectory forecasting, deep learning, perception, acceleration prediction
National Category
Computer Sciences
Identifiers
urn:nbn:se:mdh:diva-70312 (URN); 10.3390/jimaging10120321 (DOI); 001386658800001 (); 39728218 (PubMedID); 2-s2.0-85213432261 (Scopus ID)
Available from: 2025-02-26. Created: 2025-02-26. Last updated: 2025-02-26. Bibliographically approved
Lindén, J., Burresi, G., Forsberg, H., Daneshtalab, M. & Söderquist, I. (2024). Enhancing Drone Surveillance with NeRF: Real-World Applications and Simulated Environments. In: 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC). Paper presented at 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC), San Diego, CA, USA, 29/9-3/10, 2024. Institute of Electrical and Electronics Engineers (IEEE), Article ID 204263.
2024 (English). In: 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC), Institute of Electrical and Electronics Engineers (IEEE), 2024, article id 204263. Conference paper, Published paper (Refereed)
Abstract [en]

Machine Learning (ML) systems require representative and diverse datasets to accurately learn the objective task. In supervised learning, data needs to be accurately annotated, which is an expensive and error-prone process. We present a method for generating synthetic data tailored to the use-case achieving excellent performance in a real-world use case. We provide a method for producing automatically annotated synthetic visual data of multirotor unmanned aerial vehicles (UAV) and other airborne objects in a simulated environment with a high degree of scene diversity, from collection of 3D models to generation of annotated synthetic datasets (synthsets). In our data generation framework SynRender we introduce a novel method of using Neural Radiance Field (NeRF) methods to capture photo-realistic high-fidelity 3D-models of multirotor UAVs in order to automate data generation for an object detection task in diverse environments. By producing data tailored to the real-world setting, our NeRF-derived results show an advantage over generic 3D asset collection-based methods where the domain gap between the simulated and real-world is unacceptably large. In the spirit of keeping research open and accessible to the research community we release our dataset VISER DroneDiversity used in this project, where visual images, annotated boxes, instance segmentation and depth maps are all generated for each image sample.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
datasets, neural networks, synthetic data generation, automatic annotation, dataset generation
National Category
Computer graphics and computer vision
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-69153 (URN); 10.1109/DASC62030.2024.10749011 (DOI); 2-s2.0-85211243547 (Scopus ID); 9798350349610 (ISBN)
Conference
2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC), San Diego, CA, USA, 29/9-3/10, 2024
Available from: 2024-11-18. Created: 2024-11-18. Last updated: 2025-02-07. Bibliographically approved