Towards Cognizant Hearing Aids: Modeling of Content, Affect and Attention
In IMM-PhD-2012, 2012
Abstract
Hearing aids have improved significantly since the integration of advanced digital signal processing. This improvement is expected to continue through intelligent, individualized hearing aids that integrate top-down (knowledge-based) and bottom-up (signal-based) approaches. Such hearing aids can draw on research in cognitive science, the interdisciplinary study of mind and intelligence, which brings together disciplines including Artificial Intelligence, Cognitive Psychology, and Neuroscience. This thesis focuses on three subjects within cognitive science related to hearing. First, a novel method for automatic speech recognition (ASR) using binary features derived from binary masks is discussed. The robustness of these binary features to noise is compared with that of the state-of-the-art ASR features, mel-frequency cepstral coefficients. Second, human top-down auditory attention is studied. A computational top-down attention model is presented, and behavioral experiments are carried out to investigate the role of top-down, task-driven attention in the cocktail party problem. Finally, automatic emotion recognition from speech is studied using a dimensional approach, with a focus on integrating semantic and acoustic features. To evaluate the proposed method, an emotional speech corpus is prepared; it consists of short movie clips with audio and text parts, rated by human subjects along two affective dimensions (arousal and valence).