NMF-Based Improvement of DNN and LSTM Pre-Training for Speech Enhancement
Authors
Razieh Safari Dehnavi, Sanaz Seyedin
Source
International Journal of Information and Communication Technology Research, 2023, Vol. 15, No. 3, pp. 53-65
Abstract
A novel pre-training method is proposed to improve the performance of deep neural networks (DNN) and long short-term memory (LSTM) networks and to reduce the local-minimum problem in speech enhancement. We propose initializing the last-layer weights of the DNN and LSTM with the transposed non-negative matrix factorization (NMF) basis values instead of random weights. Owing to its ability to extract speech features even in the presence of non-stationary noise, NMF is faster and more successful than previous pre-training methods at achieving network convergence. Using the NMF basis matrix in the first layer along with another pre-training method is also proposed. To achieve better results, we further propose training individual models for each noise type based on a noise classification strategy. Evaluation of the proposed method on TIMIT data shows that it significantly outperforms the baselines in terms of perceptual evaluation of speech quality (PESQ) and other objective measures. Our method outperforms the baselines in PESQ by up to 0.17, an improvement of 3.4%.
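As a rough illustration of the last-layer initialization described in the abstract (not the authors' code; the layer sizes, the sklearn/PyTorch tooling, and the use of a clean-speech magnitude spectrogram are assumptions), the sketch below factorizes a spectrogram with NMF and copies the resulting basis matrix into the final layer of a simple enhancement DNN in place of its random weights:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import NMF

# Assumed sizes: 257 frequency bins; the hidden width is chosen equal to
# the number of NMF components so the basis can seed the last layer.
n_freq, n_components = 257, 256

# V: non-negative magnitude spectrogram of training speech (freq x frames);
# random stand-in data here.
V = np.abs(np.random.randn(n_freq, 1000)).astype(np.float32)

# Factorize V ~= W @ H; W (freq x components) is the speech basis matrix.
W = NMF(n_components=n_components, init="nndsvd", max_iter=200).fit_transform(V)

# Simple enhancement DNN whose last Linear layer maps hidden units to
# frequency bins (the enhanced spectrum).
model = nn.Sequential(
    nn.Linear(n_freq, n_components), nn.ReLU(),
    nn.Linear(n_components, n_components), nn.ReLU(),
    nn.Linear(n_components, n_freq),
)

# Pre-training step: initialize the last-layer weights from the NMF basis
# instead of random values. PyTorch stores Linear weights as
# (out_features, in_features) = (n_freq, n_components), so W fits directly;
# under a (hidden x freq) weight convention W.T would be used instead.
with torch.no_grad():
    model[-1].weight.copy_(torch.from_numpy(W.astype(np.float32)))

# Fine-tuning of the whole network on noisy/clean pairs would then start
# from this NMF-informed initialization rather than a random one.
```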
Keywords
pre-training, deep neural networks (DNN), long short-term memory (LSTM), non-negative matrix factorization (NMF), speech enhancement, basis matrix, noise classification
Address
Amirkabir University of Technology (Tehran Polytechnic), Department of Electrical Engineering, Iran; Amirkabir University of Technology (Tehran Polytechnic), Department of Electrical Engineering, Iran
Email
sseyedin@aut.ac.ir