Sound localization for human interaction in real environment
Mälardalen University, School of Innovation, Design and Engineering.
2011 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

For a robot to succeed at speech recognition, it is advantageous to have a strong and clear signal to interpret. To facilitate this, the robot can turn toward the sound source to obtain a clearer signal, which requires a sound source localization system. Turning toward the speaker also makes the interaction feel more natural to the human. To determine where the sound source is positioned, an angle relative to the microphone pair is calculated using the interaural time difference (ITD), which is the difference in the time of arrival of the sound between the two microphones. To achieve good results, the microphone signals need to be preprocessed; there are also different algorithms for calculating the time difference, which are investigated in this thesis. The results presented in this work come from tests that emphasize real-time systems, covering noisy environments and response time. The results show the complexity of the balance between computational time and precision.

Abstract [sv]

For a robot to succeed at speech recognition, it is advantageous to have a strong and clear signal to interpret. To facilitate this, the robot can steer and orient itself toward the sound source to get a clearer signal, and for this to be possible a system for localizing the sound source is required. If the robot turns toward the speaker, this also gives a more natural feeling when a human interacts with the robot. To determine where the sound source is located, an angle relative to the microphone pair is calculated using the interaural time difference (ITD), which is the difference in the arrival time of the sound between the two microphones. To achieve good results, the microphone signals must be preprocessed, and there are also different algorithms for calculating the time difference, which are investigated in this thesis. The results presented in this work come from tests with an emphasis on real-time systems, covering noisy environments and response time. The results show the complexity of the balance between computation time and precision.
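The abstract describes estimating an angle from the ITD of a microphone pair. The following is a minimal sketch of that idea, not the thesis implementation: it finds the delay by brute-force time-domain cross-correlation and converts it to an angle with the common far-field model sin(θ) = c·τ/d. The function names, the speed of sound (343 m/s), the 0.2 m microphone spacing, and the 48 kHz sample rate are all assumptions chosen for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def estimate_delay_samples(left, right, max_lag):
    """Return the lag (in samples) that maximizes sum(left[i] * right[i + lag]).

    A positive lag means the right signal is a delayed copy of the left one,
    i.e. the sound reached the left microphone first.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def itd_to_angle_degrees(delay_samples, sample_rate, mic_distance):
    """Convert a delay in samples to an angle using the far-field model.

    0 degrees is straight ahead of the pair; positive angles point toward
    the left microphone under the lag convention above.
    """
    itd = delay_samples / sample_rate  # interaural time difference in seconds
    ratio = SPEED_OF_SOUND * itd / mic_distance
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding beyond +/-1
    return math.degrees(math.asin(ratio))


def demo():
    sample_rate = 48000   # Hz (assumed)
    mic_distance = 0.2    # metres (assumed)
    true_delay = 14       # samples by which the right channel lags

    rng = random.Random(0)
    left = [rng.uniform(-1.0, 1.0) for _ in range(512)]
    right = [0.0] * true_delay + left[:-true_delay]

    # The physically possible delay is bounded by the microphone spacing.
    max_lag = math.ceil(mic_distance * sample_rate / SPEED_OF_SOUND)
    lag = estimate_delay_samples(left, right, max_lag)
    return lag, itd_to_angle_degrees(lag, sample_rate, mic_distance)


if __name__ == "__main__":
    lag, angle = demo()
    print(f"estimated lag: {lag} samples, angle: {angle:.1f} degrees")
```

The brute-force search costs on the order of N·(2·max_lag + 1) multiplications per frame. Frequency-domain correlation via the Fourier transform, which the keywords mention, is a common way to reduce that cost, which relates to the trade-off between computational time and precision noted in the abstract.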

Place, publisher, year, edition, pages
2011. 97 p.
Keyword [en]
Cross-correlation, ITD, Fourier transform, Sound source localization
National Category
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-12496
OAI: oai:DiVA.org:mdh-12496
DiVA: diva2:423959
Subject / course
Computer Science
Uppsok
Technology
Available from: 2011-08-29. Created: 2011-06-16. Last updated: 2011-08-29. Bibliographically approved.

Open Access in DiVA

Sound localization for human interaction in real environment (4348 kB), 161 downloads
File information
File name: FULLTEXT01.pdf
File size: 4348 kB
Checksum (SHA-512): 98f6417af0d633cf030d6c517a54bf461fee7820214959d770697af4b456d3abfdd40fb4c8ca83243507b511e2b45da509cd47c3db11eb869d973dc8c41984d4
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Strömberg, Ralf; Svensson, Stig-Åke
By organisation
School of Innovation, Design and Engineering
Computer Science
