A Statistical Model for Evaluating Interactive Question Answering Systems Using Regression

Authors	Hosseini Mohammad Mehdi, Zahedi Morteza, Hassanpour Hamid
Source	Signal and Data Processing - 1398 (Solar Hijri) - Issue: 3 - Pages: 37-48
Abstract	As in many other areas of computational linguistics, evaluation plays an important role in interactive question answering (IQA) systems. Nevertheless, there is almost no specific method for evaluating these systems as a whole, and a human must always take part in the evaluation process. A model that can replace the human in the evaluation process is therefore a topic of interest in this field. This paper presents a statistical model for evaluating IQA systems, intended to replace the human evaluator, built from a set of new features and regression. A suitable dataset was created from four existing interactive systems, and 540 samples were selected as suitable data from which the training and test sets were formed. The conversations were first preprocessed; statistical features were then extracted from the conversation text according to defined relations, and a feature matrix was formed from them. Several kinds of regression were then applied to find the best model. In the end, nonlinear power-series regression, with an RMSE of 0.13, gave the best model.
Keywords	Evaluation, Interactive Question Answering System, Nonlinear Regression, Feature Extraction
Address	Islamic Azad University, Shahrood Branch, Faculty of Electrical and Computer Engineering, Iran; Shahrood University of Technology, Faculty of Computer Engineering and Information Technology, Iran; Shahrood University of Technology, Faculty of Computer Engineering and Information Technology, Iran
Email	h.hassanpour@shahroodut.ac.ir

A New Statistical Model for Evaluation Interactive Question Answering Systems Using Regression

Authors	Hosseini Mohammad Mehdi, Zahedi Morteza, Hassanpour Hamid
Abstract	The development of computer systems and the extensive use of information technology in everyday life have made quick access to information increasingly important. The growing volume of information is difficult to manage or control, so tools are needed to make use of it. A question answering (QA) system is an automated system for obtaining correct answers to questions posed by humans in natural language. In these systems, if the response found is not the one the user expected, or if the user needs more information, there is no way for the system and the user to exchange information in order to ask follow-up questions and get related answers. To solve this problem, interactive question answering (IQA) systems were created. IQA systems deal with linguistically ambiguous structures, so they are more accurate than QA systems. Because ambiguity can arise in the user's question or in the answer provided by the system, these systems allow the exchange to be repeated until the ambiguity is resolved. No standard methods have been developed for evaluating IQA systems; the existing evaluation methods are based on those used for QA and dialogue systems. Evaluating IQA systems involves qualitative as well as quantitative evaluation, which requires users' participation in the evaluation process to determine how successful the interaction between the system and the user was. Evaluation plays an important role in IQA systems, yet there is practically no specific methodology for evaluating these systems as a whole. The main difficulty in designing an evaluation method for IQA systems is that the interaction part can rarely be predicted. For this reason, humans need to be involved in the evaluation process.
In this paper, an appropriate model for evaluating IQA systems is presented by introducing a set of built-in features. For the evaluation, four IQA systems were considered, based on the conversations exchanged between users and the systems. Moreover, 540 samples were selected as suitable data from which to create the training and test sets. The statistical characteristics of each conversation were extracted after preprocessing, and a feature matrix was formed from the extracted characteristics. Finally, linear and nonlinear regression were used to predict the human judgment. The nonlinear power regression, with a root mean square error (RMSE) of 0.13, was the best model.
Keywords	Evaluation, Interactive Question Answering Systems, Nonlinear Regression, Feature Extraction
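
The model-selection step described in the abstract (form a feature matrix, fit linear and nonlinear regressions, keep the model with the lowest RMSE) can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic, the number of features (4) is arbitrary, and "power-series regression" is assumed here to mean least squares on per-feature powers up to a fixed degree. All function names are hypothetical, and only the sample count of 540 comes from the abstract.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between two equally sized sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def power_design(X, degree):
    """Design matrix with a bias column and x_j, x_j^2, ..., x_j^degree
    for every feature j (no cross terms). This is an assumed reading of
    'power-series regression', not the paper's definition."""
    X = np.asarray(X, dtype=float)
    cols = [np.ones((X.shape[0], 1))]
    for d in range(1, degree + 1):
        cols.append(X ** d)
    return np.hstack(cols)


def fit_least_squares(X, y, degree):
    """Ordinary least squares on the power design matrix."""
    coef, *_ = np.linalg.lstsq(power_design(X, degree), y, rcond=None)
    return coef


def predict(X, coef, degree):
    return power_design(X, degree) @ coef


def make_synthetic_data(n_samples=540, n_features=4, seed=0):
    """Synthetic stand-in for the feature matrix and human scores.
    The nonlinear target below is invented purely for illustration."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
    y = X[:, 0] ** 2 + 0.3 * X[:, 1] - 0.5 * X[:, 2] ** 3 + 0.1
    y = y + rng.normal(0.0, 0.05, size=n_samples)
    return X, y


def compare_models(test_fraction=0.2, seed=0):
    """Fit a linear and a degree-3 power model; return test RMSE for each."""
    X, y = make_synthetic_data(seed=seed)
    n_test = int(len(y) * test_fraction)
    X_test, y_test = X[:n_test], y[:n_test]
    X_train, y_train = X[n_test:], y[n_test:]
    results = {}
    for name, degree in (("linear", 1), ("power_series_deg3", 3)):
        coef = fit_least_squares(X_train, y_train, degree)
        results[name] = rmse(y_test, predict(X_test, coef, degree))
    return results


if __name__ == "__main__":
    for name, err in compare_models().items():
        print(f"{name}: test RMSE = {err:.3f}")
```

Running the script prints one test RMSE per model. On this synthetic data the power model should score lower only because the invented target is itself nonlinear, so the numbers say nothing about the paper's reported RMSE of 0.13.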