
2018 Academic Seminar Announcement (VII) Kit Yan Chan: Neural Network for Speech Enhancement

Posted 18 May 2018 by 贾周圣

 

Presentation title: Neural Network for Speech Enhancement

Presenter: Kit Yan Chan (from Curtin University, Australia)

Presentation time: 2.00 pm, 18 May 2018 (Friday)

Presentation venue: A408

Audiences: academic staff, research students, final year undergraduate students

Brief CV of the presenter: Kit Yan Chan received the Ph.D. degree in computing from London South Bank University, London, U.K., in 2006. He is currently a Senior Lecturer with the Department of Electrical and Computer Engineering, Curtin University, Perth, WA, Australia. His current research interests include machine learning applications. Dr. Chan has served as a Guest Editor of Applied Soft Computing, Neurocomputing, Engineering Applications of Artificial Intelligence, the International Journal of Fuzzy Systems, and the Journal of Engineering Design. He is an Associate Editor of Neurocomputing and the International Journal of Fuzzy Systems, and a member of the Editorial Board of the Journal of Engineering Design.

Abstract: To improve the quality of noisy speech, commonly used approaches require knowledge of the probability density functions of both the speech and the noise in the short-time Fourier transform domain, as well as the signal-to-noise ratio (SNR). However, both probability density functions are difficult to estimate accurately, since speech and noise are time-varying and nonstationary in the real world. To avoid these non-trivial estimations, we propose a multi-neural network for speech enhancement. The multi-neural network consists of a set of neural networks, each of which enhances a particular critical band at a particular SNR. Critical bands are used because they match human hearing. Each network in the multi-neural network simulates a gain function that is matched to human hearing for one critical band and one SNR. The multi-neural network is trained on speech contaminated with pink noise, where the network input is the contaminated signal and the network output is the clean signal. In this way, the network learns how to generate clean speech when a noisy speech signal is received. The speech enhancement capability of the multi-neural network is evaluated on a set of 28 noisy speech signals, and its performance is compared with a commonly used speech enhancement approach. Performance is measured with several speech quality metrics, namely the noise reduction ratio, intelligibility-frequency-weighted segmental SNR, perceptual evaluation of speech quality (PESQ), and short-time objective intelligibility (STOI). Results show that the proposed multi-neural network is comparable to the commonly used method. Some future research directions for improving current speech enhancement networks are also suggested. (The presentation is based on the Conference Proceedings of APSIPA, pp. 1300-1303, 2017.)
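
For attendees who would like a concrete picture of the architecture described above, the sketch below illustrates the general idea of a bank of small networks, one per (critical band, SNR bin) pair, each predicting a gain for its band. It is a minimal illustration in PyTorch, not the design used in the paper: the numbers of bands and SNR bins, the layer sizes, the feature layout, and all names (BandGainNet, MultiNet) are assumptions made for illustration.

```python
# Minimal sketch (illustrative assumptions only) of a multi-neural-network
# speech enhancer: one small gain network per (critical band, SNR bin).
import torch
import torch.nn as nn

N_BANDS = 24      # assumed number of critical (Bark-like) bands
N_SNR_BINS = 5    # assumed number of discrete SNR conditions
BAND_FEATS = 8    # assumed number of features describing one noisy band frame

class BandGainNet(nn.Module):
    """Small MLP mapping noisy band features to a gain in [0, 1]."""
    def __init__(self, in_dim=BAND_FEATS, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # gain in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

class MultiNet(nn.Module):
    """One BandGainNet per (critical band, SNR bin), as the abstract describes."""
    def __init__(self):
        super().__init__()
        self.nets = nn.ModuleList(
            [BandGainNet() for _ in range(N_BANDS * N_SNR_BINS)]
        )

    def forward(self, band_feats, band_idx, snr_idx):
        # band_feats: (batch, BAND_FEATS) features of one noisy band frame
        net = self.nets[band_idx * N_SNR_BINS + snr_idx]
        return net(band_feats)  # predicted gain for that band at that SNR

# Training-loop sketch: the target gain would be derived from the clean/noisy
# pair (pink-noise-contaminated speech), so each network learns to recover its
# clean band. Random tensors stand in for real features and oracle gains.
model = MultiNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

noisy_feats = torch.randn(16, BAND_FEATS)   # stand-in for real band features
target_gain = torch.rand(16, 1)             # stand-in for oracle gains
gain = model(noisy_feats, band_idx=3, snr_idx=1)
loss = loss_fn(gain, target_gain)
opt.zero_grad(); loss.backward(); opt.step()
```

In a full system of this kind, the enhanced signal would be resynthesized by applying the predicted gains to the noisy critical-band spectrum; the sketch only shows how the per-band, per-SNR networks could be organized and trained.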
