An Empirical Analysis of Traditional Recognition Methods Using Examples of Identifying Words Spoken by Native Speakers
|
Author
|
Ismailov Elchin
|
Source
|
Problems of Information Society, 2025, Volume 16, Issue 1, pp. 68-74
|
Abstract
|
Many users now interact with some form of artificial intelligence daily through search engines, social media, and voice recognition software. As the field matures, it is likely to permeate our lives in ever more surprising ways, so it will be important to create new governance structures to ensure its fair and transparent use. Alongside machine vision algorithms for processing photo and video data and natural language techniques for semantic analysis of texts, working with audio data is among the procedures most in demand for business analytics. The article considers the problem of speech signal recognition using an audio database of words pronounced by a native speaker in different tonalities with his characteristic pronunciation. In the proposed approach, the sound signal is treated as a one-dimensional representation of sound wave oscillations sampled at a given frequency. The task is addressed with the classical DTW (dynamic time warping) and DDTW (derivative dynamic time warping) methods, as well as methods based on the Fourier transform and the discrete and continuous wavelet transforms. A computational experiment on recognizing speech signals in the Azerbaijani language showed the continuous wavelet transform to be the most accurate recognition method for this problem.
|
Keywords
|
signal recognition, recognition method, audio database, sound recording, adequacy criteria, distance metric, pairwise comparison of signals
|
Address
|
Institute of Control Systems, Azerbaijan
|
Email
|
elchin.ismayilov1@mail.ru