A Three-Dimensional Tensor Approach for Classifying and Detecting Fake News: A Case Study of Persian News on the Coronavirus


Authors	Mottaghi Vahid, Esmaeili Mahdi, Bazaee Ghasem Ali, Afshar Kazemi Mohammad Ali
Source	علوم و فنون مديريت اطلاعات (Sciences and Techniques of Information Management) - 1400 (Solar Hijri) - Volume 7 - Issue 4 - Pages 221-250
Abstract	Purpose: This study aims to assign free-text items to one of two classes, fake or real. Convolutional neural networks, among the most important deep learning models, have achieved high accuracy on such problems. The research focuses on sentence-level text analysis and on improving the performance of convolutional neural networks for fake news detection. In these networks, words are fed to the model as a bag of words, and each word is mapped through a vector space so that the input becomes two-dimensional matrices. One limitation of convolutional networks is that they operate at the word level and cannot take into account the relationship and distance between sentences; sentence-level analysis is therefore the central problem of this research. A baseline model based on convolutional networks is proposed in which documents are fed to the network as three-dimensional tensors to overcome this problem. Using three-dimensional tensors lets the model learn the position of words within a sentence and yields more accurate fake news detection. Methodology: This is an applied study in which about 42,000 Persian news items from various cities of Iran were collected from Twitter. Preprocessing removed redundant and unhelpful data, the cleaned texts were labeled, and the news text was then processed for the proposed approach using Python. Findings: Some machine learning algorithms were stronger on classification problems, but after changes to the structure of the convolutional network algorithm, better results were obtained than with the machine learning algorithms and other similar algorithms. Conclusion: Using three-dimensional tensors lets the model learn the position of words within a sentence, and the proposed model achieves considerable accuracy compared with approaches proposed in the literature. Without adding overhead in the number of features or the depth of the network, and only by changing the input, the proposed model achieves better results than other approaches in the literature, with accuracy and precision above 94 percent.
Keywords	Natural language processing, text classification, convolutional neural networks, three-dimensional tensor, fake news, Persian news, coronavirus
Address	Islamic Azad University, Qeshm Branch, Department of Information Technology Management, Iran; Islamic Azad University, Kashan Branch, Department of Computer Science, Iran; Islamic Azad University, Central Tehran Branch, Department of Management, Iran; Islamic Azad University, Central Tehran Branch, Department of Management, Iran
Email	m_afsharkazemi@iauec.ac.ir
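
The abstract describes feeding each document to the convolutional network as a three-dimensional tensor rather than a flat word matrix. A minimal Python sketch of one way such an input could be built is given here, assuming the three axes are sentences, words within a sentence, and embedding components; the abstract does not state the exact layout or sizes. The names `toy_embedding`, `document_to_tensor`, `shape`, and the three size constants are hypothetical, and the hash-based vectors only stand in for trained word embeddings.

```python
import hashlib
import re

# Assumed sizes, chosen only for illustration (not taken from the paper).
MAX_SENTENCES = 4   # sentences kept per document
MAX_WORDS = 6       # words kept per sentence
EMBED_DIM = 8       # components per word vector


def toy_embedding(word, dim=EMBED_DIM):
    """Deterministic stand-in for a trained word embedding (hypothetical)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    # Map the first `dim` bytes of the hash to floats in [-1, 1].
    return [b / 127.5 - 1.0 for b in digest[:dim]]


def document_to_tensor(text):
    """Return nested lists of shape (MAX_SENTENCES, MAX_WORDS, EMBED_DIM).

    Short sentences and short documents are padded with zero vectors;
    long ones are truncated.
    """
    # Split on Latin and Persian sentence-ending punctuation.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    tensor = []
    for i in range(MAX_SENTENCES):
        words = sentences[i].split()[:MAX_WORDS] if i < len(sentences) else []
        row = [toy_embedding(w) for w in words]
        # Pad the sentence up to MAX_WORDS with zero vectors.
        row += [[0.0] * EMBED_DIM for _ in range(MAX_WORDS - len(row))]
        tensor.append(row)
    return tensor


def shape(tensor):
    """Shape of the nested-list tensor as (sentences, words, dims)."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

Because sentence boundaries are kept as a separate axis, a convolution over this input can see which sentence a word belongs to and where it sits inside that sentence, which a single flat word-by-embedding matrix does not expose.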

Providing a Three-Dimensional Tensor Approach for Classifying and Detecting Fake News: A Case Study of Persian News in the Field of COVID-19

Authors	Mottaghi Vahid, Esmaeili Mahdi, Bazaee Ghasem Ali, Afshar Kazemi Mohammad Ali
Abstract	Purpose: This study aims to assign free texts to one of two classes, fake or real. Convolutional neural networks, as one of the most important deep learning models, have achieved high accuracy on such problems. The study focuses on sentence-level text analysis and on improving the performance of convolutional neural networks for detecting fake news. In these networks, words are fed to the model as a bag of words, and each word is converted into two-dimensional matrices according to the vector space. One limitation of convolutional networks is that they work at the word level and cannot consider the relationship and distance between sentences, so sentence-level analysis is the central problem of this research. A basic model based on convolutional networks is proposed in which documents are given to the network as 3D tensors to solve this problem. Using 3D tensors allows the model to learn the position of words in a sentence and to achieve more accurate results in detecting fake news. Methodology: This is an applied study in which about 42,000 Persian news items from different cities of Iran were collected from Twitter. Redundant and unhelpful data were removed in preprocessing, the cleaned texts were labeled, and the news text was then processed for the proposed approach using Python and related libraries. Findings: During testing, some machine learning algorithms performed more strongly on classification problems, but after changes to the structure of the convolutional network algorithm, better results were obtained than with the machine learning algorithms and other similar algorithms. Conclusion: Using 3D tensors allows the model to learn the position of words in a sentence, and the proposed model achieves considerable accuracy compared with approaches proposed in the literature. Without adding overhead in the number of features or network depth, and only by changing the input, the proposed model achieves better results than other approaches in the literature, with an accuracy of more than 94%.
Keywords