A novel model based on encoder-decoder architecture and attention mechanism for automatic abstractive text summarization

Authors	Aliakbarpor Hasan, Manzouri Mohammadtaghi, Rahmani Amirmasoud
Source	Information and Communication Technology of Iran (فناوري اطلاعات و ارتباطات ايران) - 1401 - Volume: 14 - Issue: 51-52 - Pages: 55-72
Abstract	With the expansion of the web and the availability of a large volume of information in the form of text documents, the development of automatic text summarization systems has become a focus of researchers as one of the important topics in natural language processing. With the introduction of deep learning methods in text processing, text summarization has entered a new phase of development, and in recent years abstractive summarization has made considerable progress. It can nevertheless be argued that the full capacity of deep networks has not yet been used for this purpose, and progress in this area that also takes cognitive characteristics into account is still needed. To this end, this paper introduces a sequence model equipped with an auxiliary attention mechanism for abstractive text summarization. The model not only uses a combination of linguistic features and embedding vectors as the input of the learning model but also, unlike previous studies that always used the attention mechanism in the decoder, uses an auxiliary attention mechanism in the encoder. With this auxiliary attention mechanism, which is inspired by how the human mind works when producing a summary, only the more important parts of the input text are encoded, rather than the whole text, and passed to the decoder to generate the summary. The proposed model also uses a switch with a threshold in the decoder to overcome the rare-word problem. The proposed model was tested on two datasets, CNN/Daily Mail and DUC2004. According to the experimental results and the ROUGE evaluation metric, the proposed model achieves higher accuracy than other existing methods for generating abstractive summaries on both datasets.
Keywords	deep learning, abstractive summarization, encoder-decoder architecture, auxiliary attention mechanism, linguistic features
Address	Islamic Azad University, Science and Research Branch, Department of Computer Engineering, Iran; Sharif University of Technology, Department of Computer Engineering, Iran; Professor, Islamic Azad University, Science and Research Branch, Department of Computer Engineering, Iran
Email	rahmani@sbriau.ac.ir

A novel model based on encoder-decoder architecture and attention mechanism for automatic abstractive text summarization

Authors	Aliakbarpor Hasan, Manzouri Mohammadtaghi, Rahmani Amirmasoud
Abstract	With the expansion of the web and the availability of large amounts of textual information, the development of automatic text summarization models, an important aspect of natural language processing, has attracted many researchers. With the growth of deep learning methods in text processing, text summarization has entered a new phase of development, and abstractive text summarization has made significant progress in recent years. Even so, it can be argued that the full potential of deep learning has not yet been used for this purpose, and there is still a need for progress in this field that also takes human cognition into account when building the summarization model. In this regard, this paper proposes an encoder-decoder architecture equipped with auxiliary attention. It not only uses a combination of linguistic features and embedding vectors as the input of the learning model but also, unlike previous studies that commonly employed the attention mechanism in the decoder, applies an auxiliary attention mechanism in the encoder to imitate human cognition during summary generation. With the proposed attention mechanism, only the most important parts of the text, rather than the whole input, are encoded and sent to the decoder to generate the summary. The proposed model also uses a switch with a threshold in the decoder to overcome the rare-word problem. The model was evaluated on the CNN/Daily Mail and DUC2004 datasets. Based on the empirical results and the ROUGE evaluation metric, the proposed model obtained higher accuracy than other existing methods for generating abstractive summaries on both datasets.
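
The abstract names two mechanisms without giving their equations: encoder-side selection of the most important parts of the input, and a decoder switch with a threshold for rare words. The following is only a minimal sketch of how such steps could look, not the authors' implementation. The function names (`select_salient`, `decode_step`), the use of softmax-normalized saliency scores, the top-k selection rule, the switch probability `p_generate`, and the default threshold of 0.5 are all assumptions introduced for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_salient(segments, scores, keep):
    """Hypothetical encoder-side selection.

    Normalizes raw saliency scores (assumed to come from some auxiliary
    attention layer) and keeps only the `keep` highest-weighted segments,
    preserving their original order in the source text.
    """
    weights = softmax(scores)
    ranked = sorted(range(len(segments)), key=lambda i: weights[i], reverse=True)
    chosen = sorted(ranked[:keep])
    return [segments[i] for i in chosen]


def decode_step(vocab_probs, copy_weights, source_tokens, p_generate, threshold=0.5):
    """Hypothetical decoder switch for rare words.

    If the (assumed) switch probability `p_generate` reaches the threshold,
    emit the most likely vocabulary word; otherwise copy the source token
    with the highest copy weight.
    """
    if p_generate >= threshold:
        return max(vocab_probs, key=vocab_probs.get)
    best = max(range(len(source_tokens)), key=lambda i: copy_weights[i])
    return source_tokens[best]


# Illustrative inputs only; none of these values come from the paper.
kept = select_salient(["s1", "s2", "s3", "s4"], [0.1, 2.0, -1.0, 1.5], keep=2)
copied = decode_step({"the": 0.6, "<unk>": 0.4}, [0.1, 0.7, 0.2],
                     ["visit", "Tehran", "today"], p_generate=0.2)
generated = decode_step({"the": 0.6, "<unk>": 0.4}, [0.1, 0.7, 0.2],
                        ["visit", "Tehran", "today"], p_generate=0.9)
```

In this sketch, a low switch probability makes the decoder copy a source token such as a proper noun instead of emitting an unknown-word placeholder. This is one common way to handle rare words, and the paper's actual switch may differ.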