An automatic speech detection architecture for social robot oral interaction – RAPP paper at Audio Mostly 2015
RAPP partners will present a paper entitled “An automatic speech detection architecture for social robot oral interaction” at the Audio Mostly event to be held in Thessaloniki on October 7-9, 2015.
Abstract: Social robotics has become a trend in contemporary robotics research, since social robots can be used successfully in a wide range of applications. One of the most fundamental communication skills a robot must have is oral interaction with a human, in order to provide feedback or accept commands. Although text-to-speech is an almost solved problem, this is not the case for speech detection, since it must cope with a large number of different conditions, many of which are unpredictable. Quite a few well-established ASR (Automatic Speech Recognition) tools exist, but they do not provide efficient results, especially for less popular languages. The current paper investigates different speech detection strategies using the Sphinx-4 open-source library. The first is a way to incorporate languages for which no acoustic or language model exists (Greek in our case), following the grapheme-to-phoneme concept. The speech detection model is evaluated using audio captured from a NAO v4 robot, a difficult task due to the high levels of noise in the recordings; denoising techniques are therefore investigated as well.
This paper was written in the framework of the RAPP project by Manos Tsardoulias, Andreas Symeonidis and Pericles Mitkas from the Informatics and Telematics Institute (ITI) of CERTH, in Greece.
The Audio Mostly 2015 event invites its participants to explore the unexploited potential of audio in computer-based environments, for example in game contexts. It aims to help open up this area of thinking by bringing together game designers, audio experts, content creators, and technology and behavioral researchers. Through this forum, experts from a variety of fields will discuss developments and new potential for audio in areas such as entertainment, health and fitness, education, industrial training, serious gaming, and more.
For more information, please visit the Audio Mostly event website.
You can also follow the event on Twitter with the hashtag #AudioMostly.