https://www.mdu.se/

mdu.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
A Keyword Based Interactive Speech Recognition System for Embedded Applications
Mälardalen University, School of Innovation, Design and Engineering.
Mälardalen University, School of Innovation, Design and Engineering.
2011 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Speech recognition has been an important area of research during the past decades. The usage of automatic speech recognition systems is rapidly increasing among different areas, such as mobile telephony, automotive, healthcare, robotics and more. However, despite the existence of many speech recognition systems, most of them use platform specific and non-publicly available software. Nevertheless, it is possible to develop speech recognition systems using already existing open source technology.

The aim of this master's thesis is to develop an interactive and speaker independent speech recognition system. The system shall be able to identify predetermined keywords from incoming live speech and in response, play audio files with related information. Moreover, the system shall be able to provide a response even if no keyword was identified. For this project, the system was implemented using PocketSphinx, a speech recognition library, part of the open source Sphinx technology by the Carnegie Mellon University.

During the implementation of this project, the automation of different steps of the process, was a key factor for a successful completion. This automation consisted on the development of different tools for the creation of the language model and the dictionary, two important components of the system. Similarly, the audio files to be played after identifying a keyword, as well as the evaluation of the system's performance, were fully automated.

The tests run show encouraging results and demonstrate that the system is a feasible solution that could be implemented and tested in a real embedded application. Despite the good results, possible improvements can be implemented, such as the creation of a different phonetic dictionary to support different languages.

Place, publisher, year, edition, pages
2011. , p. 87
Keywords [en]
Automatic Speech Recognition, PocketSphinx, Embedded Systems
Identifiers
URN: urn:nbn:se:mdh:diva-12479OAI: oai:DiVA.org:mdh-12479DiVA, id: diva2:423640
Uppsok
Technology
Supervisors
Examiners
Available from: 2011-06-20 Created: 2011-06-15 Last updated: 2011-06-20Bibliographically approved

Open Access in DiVA

fulltext(2585 kB)387 downloads
File information
File name FULLTEXT01.pdfFile size 2585 kBChecksum SHA-512
a42f7ecf64f86fed07b53a60eb14988bd5272b6cb2d680d54d71f1703932566ad6622d5c908b9cad81a0e50562fb76a720098ada2aed6c8641bf2dee4830985d
Type fulltextMimetype application/pdf

By organisation
School of Innovation, Design and Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 493 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 944 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf