Adversarial Attacks in a Text Sentiment Analysis Model


Authors	Mokarrami Sefidab Sahar, Mirroshandel Abolghasem, Ahmadifar Hamidreza, Mokarrami Sefidab Mehdi
Source	Intelligent Multimedia Processing and Communication Systems - 1400 (Solar Hijri) - Volume 2 - Issue 2 - Pages 9-17
Abstract	Deep neural networks achieve high accuracy and performance on a wide range of problems, but they are vulnerable to adversarial examples. These malicious samples are generated to fool a trained model and to probe the vulnerability of neural network models. In the text domain, few successful methods for constructing such examples have been proposed. This study presents a strong method, based on the gradient of the model's cost function, for generating textual adversarial examples. It shows that replacing a small number of words in the original samples with the words that have the greatest negative effect on the classifier's decision yields new samples, similar to the originals, that fool a word-level sentiment analysis classifier. Finally, these samples were used to examine the accuracy of two pre-trained classifier models. With only slight manipulation of the input samples, the method reduced classification accuracy from 86 percent to less than 10 percent.
Keywords	Text attacks, adversarial examples, cost function gradient, sentiment analysis, natural language processing
Address	University of Guilan, Faculty of Engineering, Iran; University of Guilan, Faculty of Engineering, Iran; University of Guilan, Faculty of Engineering, Iran; Payame Noor University, Rasht Branch, Iran
Email	info.mokarrami@gmail.com

Adversarial Attacks in a Text Sentiment Analysis Model

Authors	Mokarrami Sefidab Sahar, Mirroshandel Abolghasem, Ahmadifar Hamidreza, Mokarrami Mehdi
Abstract	Background and Purpose: Recent research has shown that deep learning models, despite their high accuracy, can be made to fail through small manipulations of their input samples. Such manipulation produces new samples called adversarial examples. These samples are so similar to the originals that humans cannot tell them apart, and therefore cannot remove them from the dataset before the model makes its predictions in order to prevent model errors. Various studies have addressed generating malicious samples and feeding them to a model; among these, generating text samples is particularly difficult because text is discrete. In this research, we aimed to expose the highest level of vulnerability with a method that manipulates the input data as little as possible. In our tests, the proposed method brought the accuracy of CNN and LSTM models below 10%.
Methods: To construct malicious samples, we first use a Taylor expansion to select from the vocabulary a candidate replacement word that can increase the classifier's prediction error. Then, considering the importance of each word according to the cost computed for its corresponding candidate word, we define an order in which words are substituted. Finally, we replace words in that order until the output of the model changes.
Results: Evaluation on two sentiment analysis models, LSTM and CNN, showed that the proposed method reduces the accuracy of both models to less than 10% with a small number of replacements, which indicates its success compared to some other similar methods.
Conclusion: Much of the attention of science and industry is focused on building systems with deep learning methods, so the security of these systems is also important, as is increasing the robustness of models against adversarial examples. In this research, a method requiring the least amount of manipulation was presented for producing textual adversarial examples. In the future, it may be possible to use different natural text generation methods to produce samples that, in addition to resembling the original sample on the surface, are also comprehensible in terms of content.
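The three steps described under Methods (candidate selection by a first-order Taylor estimate, ordering of substitutions, replacement until the output changes) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy classifier (a sigmoid over the mean word embedding), the embedding values, and every name below (`EMB`, `W`, `B`, `attack`, and so on) are hypothetical stand-ins chosen so the example runs with the Python standard library alone. As a further simplification, the gradient is computed once on the original input rather than after each substitution.

```python
import math

# Hypothetical toy vocabulary with 2-d embeddings (illustrative values only).
EMB = {
    "great": [1.0, 0.2], "good": [0.8, 0.1], "fine": [0.2, 0.0],
    "movie": [0.0, 0.5], "plot": [0.0, 0.4],
    "bad": [-0.9, 0.1], "awful": [-1.2, 0.2],
}
W, B = [2.0, 0.0], 0.0  # weights and bias of the assumed linear classifier


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def prob_positive(words):
    """P(positive) for a sigmoid classifier over the mean word embedding."""
    n = len(words)
    mean = [sum(EMB[w][k] for w in words) / n for k in range(len(W))]
    return 1.0 / (1.0 + math.exp(-(dot(W, mean) + B)))


def predict(words):
    return 1 if prob_positive(words) >= 0.5 else 0


def embedding_gradient(words, label):
    """Gradient of the cross-entropy loss w.r.t. one word embedding.

    For this toy model it equals (p - y) * W / n at every position.
    """
    p = prob_positive(words)
    return [(p - label) * w_k / len(words) for w_k in W]


def attack(words, label):
    """Swap words, most harmful first, until the predicted label changes."""
    words = list(words)
    grad = embedding_gradient(words, label)
    plans = []
    for i, w in enumerate(words):
        # First-order Taylor estimate of the loss change for swapping w -> v:
        #   L(x') - L(x) ~= (e_v - e_w) . dL/de_w
        gains = {
            v: dot([a - b for a, b in zip(EMB[v], EMB[w])], grad)
            for v in EMB if v != w
        }
        best = max(gains, key=gains.get)
        plans.append((gains[best], i, best))
    plans.sort(reverse=True)  # positions with the largest estimated gain first

    swaps = 0
    for gain, i, candidate in plans:
        if predict(words) != label or gain <= 0:
            break  # already fooled, or no swap is expected to raise the loss
        words[i] = candidate
        swaps += 1
    return words, swaps


original = ["great", "movie", "good", "plot"]
adversarial, n_swaps = attack(original, label=1)
print(original, "->", adversarial, "after", n_swaps, "swap(s)")
```

On this toy input a single substitution (`great` replaced by `awful`) is enough to flip the prediction from positive to negative. A real attack would obtain the gradient from the trained CNN or LSTM by backpropagation and would likely constrain candidates so that the modified text stays similar to the original.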
Keywords