Active Research Projects in Speech, Hearing & Phonetic Sciences

Research Projects

Centre for Law Enforcement Audio Research (CLEAR)
The CLEAR project aims to create a centre of excellence in tools and techniques for cleaning poor-quality audio recordings of speech. The centre is initially funded by the U.K. Home Office for a period of five years and will be run in collaboration with the Department of Electrical and Electronic Engineering at Imperial College.
Dates: 2007-2012. Funded by: U.K. Home Office. Duration: 5 years. Researchers: Mark Huckvale, Gaston Hilkhuysen.

KLAIR - a virtual infant
The KLAIR project aims to build and develop a computational platform to assist research into the acquisition of spoken language. The main part of KLAIR is a sensori-motor server that supplies a client with an on-screen virtual infant that can see, hear and speak. The client can monitor the audio-visual input to the server and can send articulatory gestures to the head so that it speaks through an articulatory synthesizer. The client can also control the position of the head and the eyes and set facial expressions. By encapsulating the real-time complexities of audio and video processing within a server that will run on a modern PC, we hope that KLAIR will encourage and facilitate more experimental research into spoken language acquisition through interaction.
Dates: 2009. Researchers: Mark Huckvale.

Performance-based measures of speech quality
This project seeks to design and test new methods for the evaluation of speech communication systems. The methods are intended for systems that operate at high levels of speech intelligibility or that make little change to intelligibility (such as noise-reduction systems). Conventional intelligibility testing is not appropriate in these circumstances, and existing measures of speech quality are based on subjective opinion rather than speech communication performance.
Dates: 2010-2013. Funded by: Research in Motion. Duration: 3 years. Researchers: Mark Huckvale, Gaston Hilkhuysen, Mark Wibrow.

Quantitative modeling of tone and intonation
With Santitham Prom-on and Bundit Thipakorn, King Mongkut's University of Technology Thonburi, Thailand.
The aim is to develop a quantitative Target Approximation (qTA) model for simulating F0 contours of speech. Following the articulatory-functional framework of the PENTA model (Xu, 2005), the qTA model simulates the production of tone and intonation as a process of syllable-synchronized sequential target approximation. In the model, tone and intonation are treated as communicative functions that directly specify the parameters of the qTA model. The numerical values of the qTA parameters will be extracted from natural speech via supervised learning, and the quality of the modeling output will be both numerically assessed and perceptually evaluated.
Dates: 2005. Funded by: Collaborative. Researchers: Yi Xu.

Speech Perception and Language Acquisition in Children with Hearing Impairments
How do children with hearing aids and cochlear implants learn their native language? Can they use the same learning mechanisms and acoustic cues as their normal-hearing peers? In this study, we examine what children with hearing impairments know about the sound structure of their native language. We are also interested in finding out how they acquire this knowledge and whether it is correlated with their vocabulary and grammar skills.
Dates: 2009-2012. Researchers: Katrin Skoruppa, Stuart Rosen.

Spoken Language Conversion with Accent Morphing
Spoken language conversion is the challenge of using synthesis systems to generate utterances in the voice of a speaker but in a language unknown to that speaker. Previous approaches have been based on voice conversion and voice adaptation technologies applied to the output of a foreign-language TTS system. This inevitably reduces the quality and intelligibility of the output, since the source speaker will not be a good source of phonetic material in the new language. In contrast to previous work, our approach uses two synthesis systems: one in the source speaker's voice and one in the voice of a native speaker of the target language. Audio morphing technology is then exploited to correct the foreign accent of the source speaker while trying to maintain his or her identity. In this project we aim to construct a spoken language conversion system using accent morphing and to evaluate its performance in terms of intelligibility and speaker identity.
Dates: 2006-. Researchers: Mark Huckvale.

The size code in the expression of anger and joy in speech
With Suthathip Chuenwattanapranithi, King Mongkut's University of Technology Thonburi, Thailand.
The aim is to test the "size code" hypothesis for encoding anger and joy in speech. According to the hypothesis, these two emotions are conveyed in speech by exaggerating or understating the body size of the speaker, just as nonhuman animals exaggerate or understate their body size to communicate threat or appeasement. We will conduct acoustic analysis of publicly available emotional speech databases, synthesize Thai vowels with a 3D articulatory synthesizer using parameter manipulations suggested by the size code hypothesis, and ask Thai listeners to judge the body size and emotion of the speaker. Initial results support the size code hypothesis.
Dates: 2005. Funded by: Collaborative. Researchers: Yi Xu.

List of Completed Research Projects