https://www.mdu.se/

Multi-scale Data Fusion and Machine Learning for Vehicle Manoeuvre Classification
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0003-3802-4721
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0002-1212-7637
2023 (English). In: ICSET 2023 - 2023 IEEE 13th International Conference on System Engineering and Technology, Proceeding, Institute of Electrical and Electronics Engineers Inc., 2023, p. 296-301. Conference paper, Published paper (Refereed)
Abstract [en]

Vehicle manoeuvre analysis is vital for road safety as it helps understand driver behaviour, traffic flow, and road conditions. However, classifying data from in-vehicle acquisition systems or simulators for manoeuvre recognition is complex, requiring data fusion and machine learning (ML) algorithms. This paper proposes a hybrid approach that combines multivariate multiscale entropy (MMSE) and one-dimensional convolutional neural networks (1D-CNNs). MMSE is utilised for early feature extraction and data fusion, and the extracted features are classified using 1D-CNNs, achieving 87% test accuracy in multiclass classification. This paper provides insights into improving vehicle manoeuvre classification using advanced ML techniques and data fusion methods to handle complex data sets effectively. Ultimately, this approach can enhance the understanding of driver behaviour, inform policy decisions, and support the development of more effective road safety strategies.
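
The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in per-channel (univariate) sample entropy computed at several coarse-graining scales and concatenates the results as a simple fused feature vector, whereas true MMSE uses a multivariate embedding across channels. The function names, the scales, and the parameters `m` and `r_factor` are assumptions chosen for illustration, and the 1D-CNN classifier is not shown.

```python
import math
import random
import statistics


def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def sample_entropy(series, m, tol):
    """Sample entropy: -ln(A / B), where B counts template pairs of length m
    and A counts pairs of length m + 1 within Chebyshev distance `tol`."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


def multiscale_entropy_features(channels, scales=(1, 2, 3), m=2, r_factor=0.2):
    """Concatenate per-channel, per-scale entropies into one feature vector.
    Simplification (assumption): channels are treated independently."""
    features = []
    for channel in channels:
        # Tolerance is fixed from the original (scale-1) signal.
        tol = r_factor * statistics.pstdev(channel)
        for scale in scales:
            features.append(sample_entropy(coarse_grain(channel, scale), m, tol))
    return features


# Hypothetical two-channel recording: a regular signal and a noisy one.
random.seed(0)
sine = [math.sin(2 * math.pi * i / 20) for i in range(200)]
noise = [random.uniform(-1, 1) for _ in range(200)]
features = multiscale_entropy_features([sine, noise])
```

The resulting vector has one entry per channel and scale (six here), and the regular channel yields a lower scale-1 entropy than the noisy one. A vector of this kind is what a downstream classifier would consume.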

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023. p. 296-301
Keywords [en]
Data Extraction, Data Fusion, Multivariate Multiscale Entropy (MMSE), Vehicle Manoeuvre, Accident prevention, Classification (of information), Complex networks, Data mining, Entropy, Extraction, Learning algorithms, Machine learning, Motor transportation, Roads and streets, Driver's behavior, Flow condition, Machine-learning, Multi-scale datum, Multivariate multiscale entropies, Multivariate multiscale entropy, Road safety, Traffic flow, Vehicle maneuver, Vehicles
National Category
Vehicle Engineering
Identifiers
URN: urn:nbn:se:mdh:diva-65013
DOI: 10.1109/ICSET59111.2023.10295109
Scopus ID: 2-s2.0-85178031651
ISBN: 9798350340891 (print)
OAI: oai:DiVA.org:mdh-65013
DiVA, id: diva2:1819307
Conference
13th IEEE International Conference on System Engineering and Technology, ICSET 2023, Shah Alam, 2 October 2023
Available from: 2023-12-13 Created: 2023-12-13 Last updated: 2024-11-18. Bibliographically approved
In thesis
2. Enhancing Multimodal Reasoning with Data Alignment and Fusion
2024 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Multimodal machine learning (MML) has significantly transformed the development of artificial intelligence (AI) systems. Instead of working with single-source data, it integrates and analyses information from multiple modalities, such as images, audio, text, and sensor data. The volume of labelled and unlabelled multimodal data has increased rapidly, but using it effectively, especially the unlabelled portion, poses significant challenges. Existing approaches usually depend on supervised learning and struggle to handle the heterogeneity and complexity of such data. These limitations hinder the creation of effective, scalable, and generalised MML systems that can use the full potential of this diverse data.

This thesis addresses this challenge through multimodal reasoning. To this end, it introduces a scheme for advancing multimodal reasoning by making effective use of unlabelled multimodal data. The scheme is built on inferential steps that exploit the latent knowledge and patterns hidden within these vast unlabelled datasets. These steps mitigate the main limitation of supervised methods, which depend on large amounts of labelled data that are difficult to obtain in real-world scenarios. Each inferential step was selected for its specific strengths in addressing the challenges of unlabelled multimodal data. The scheme starts with an unsupervised approach to extract features, which are then used as input to a clustering approach that groups similar data points based on their hidden characteristics. The clustering sets the stage for a semi-supervised approach that assigns labels to the clustered data, converting unlabelled data into a useful and structured resource.
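
The three inferential steps described above (feature extraction, clustering, label assignment) can be sketched as follows. This is an illustrative stand-in rather than the thesis's method: the abstract does not name the algorithms used, so the sketch assumes a trivial feature extractor (window mean and spread), plain k-means with a deterministic farthest-point initialisation, and majority-vote label propagation from a handful of labelled seed samples. All function names, the example windows, and the labels "straight" and "turn" are hypothetical.

```python
import math
from collections import Counter


def extract_features(window):
    """Step 1 (stand-in): summarise a raw window by its mean and spread."""
    mean = sum(window) / len(window)
    spread = max(window) - min(window)
    return (mean, spread)


def assign_clusters(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Step 2: plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iterations):
        assignment = assign_clusters(points, centroids)
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assign_clusters(points, centroids)


def propagate_labels(assignment, seeds):
    """Step 3: give every cluster the majority label of its seed samples.
    `seeds` maps sample index -> known label; clusters without seeds get None."""
    votes = {}
    for index, label in seeds.items():
        votes.setdefault(assignment[index], Counter())[label] += 1
    cluster_label = {c: counter.most_common(1)[0][0] for c, counter in votes.items()}
    return [cluster_label.get(c) for c in assignment]


# Hypothetical unlabelled sensor windows: three calm, three active.
windows = [
    [0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0], [0.05, 0.1, 0.0, 0.05],
    [5.0, 7.0, 5.0, 7.0], [6.0, 8.0, 6.0, 8.0], [5.0, 8.0, 6.0, 7.0],
]
points = [extract_features(w) for w in windows]
assignment = kmeans(points, k=2)
# Only two samples carry a known label; the rest are inferred per cluster.
labels = propagate_labels(assignment, seeds={0: "straight", 3: "turn"})
```

In this toy setting, two seed labels are enough to label all six windows, which is the sense in which clustering turns unlabelled data into a structured resource. Real data would need a stronger feature extractor and a check that clusters are pure enough for cluster-wise labelling.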

The proposed approach was evaluated on unlabelled vehicular datasets collected in real time, and achieved more than 90% accuracy using the newly labelled dataset. Furthermore, this research explored transfer learning and its potential to enhance multimodal reasoning by using knowledge gained from one dataset to improve performance on another. A novel model based on the transformer architecture was designed specifically to handle the continuous features available in multimodal data. The model's results were satisfactory and showed that this state-of-the-art approach performed better than traditional machine learning (ML) algorithms.

This thesis makes several contributions to research on MML. It provides an extensive analysis of MML and its challenges, including existing approaches to alignment and fusion, focusing on their limitations and identifying gaps in current research. Moreover, it introduces an effective approach for labelling unlabelled datasets through a series of carefully designed inferential steps, which points towards more efficient and scalable multimodal learning. Finally, it demonstrates the potential of transfer learning, particularly with a transformer-based model, to advance multimodal reasoning. The insights, techniques, and results presented in this thesis have the potential to open a new direction in MML research and provide an opportunity to develop more useful, scalable, and data-efficient models to tackle real-world challenges across a wide range of applications.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2024. p. 336
Series
Mälardalen University Press Licentiate Theses, ISSN 1651-9256 ; 367
Keywords
Multimodal Machine Learning, Transfer Learning, Multimodal Reasoning, Data Alignment, Data Fusion
National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-69156 (URN)
978-91-7485-691-0 (ISBN)
Presentation
2025-01-13, Kappa and online via Zoom, Mälardalen University, Västerås, 09:00 (English)
Opponent
Supervisors
Projects
FitDrive
Funder
EU, Horizon 2020, 953432
Available from: 2024-11-18 Created: 2024-11-18 Last updated: 2024-12-23. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Barua, Arnab; Ahmed, Mobyen Uddin; Begum, Shahina
