Language recognition by convolutional neural networks
|
Authors
|
Khosravani Pour L., Farrokhi A.
|
Source
|
Scientia Iranica, 2023, Volume 30, Issue 1-D, Pages 116-123
|
Abstract
|
Speech recognition, in other words communication between computers and humans, has a long history as a subfield of computational linguistics or natural language processing (NLP). Automatic speech recognition (ASR), text-to-speech (TTS), speech-to-text (STT), continuous speech recognition (CSR), and interactive voice response (IVR) systems are different approaches to problems in this area. The hybrid deep neural network (DNN)-hidden Markov model (HMM) has been shown to significantly improve speech recognition performance over the conventional GMM-HMM. This improvement is partially attributed to the ability of the DNN to model complex correlations in speech features. In this paper, we show that prosodic features for the Persian language (Farsi) can be extracted by using CNNs to segment and label speech for short texts. Using 128 and 200 filters for the CNN and a special architecture, we reach an error of 19.46 in detection rate and lower time consumption than RNNs. Another advantage of using a CNN is the simplification of the learning procedure. Experimental results show that CNNs can be good feature extractors for speech recognition in Farsi or other languages.
|
Keywords
|
Speech segmentation, convolutional neural networks, Persian language CSR, deep neural network, Gaussian mixture model
|
Address
|
Islamic Azad University, South Tehran Branch, Department of Electrical Engineering, Iran; Islamic Azad University, South Tehran Branch, Department of Electrical Engineering, Iran
|
Email
|
sheredko.s@yahoo.com
|