A content-collaborative recommender system based on clustering and ontology


Authors	Bahrani Payam, Minaei-Bidgoli Behrouz, Parvin Hamid, Mirzarezaee Mitra, Keshavarz Ahmad
Source	Signal and Data Processing (پردازش علائم و داده ها) - 1402 - Issue 3 - Pages 197-224
Abstract	Recommender systems learn over time which product or item each person or customer is likely to prefer and recommend it to them. These systems usually work from the similar behavior of other (presumably similar) people. In general, finding similar people is very time-consuming because of the large number of users, and inaccurate because of the lack of information. For this reason, some methods have focused on increasing speed. Other methods have turned to adding extra information in order to improve the accuracy of finding similar or neighboring users. Still others have turned to hybrid methods. Recently, researchers have used clustering-based methods, which find the most similar neighboring users by clustering the users, together with content-based methods, sometimes augmented with an ontology, and by drawing on the advantages of these methods have addressed some of the above challenges to an acceptable degree. The proposed hybrid recommender system has two stages: in the first stage, two models make their predictions; in the second stage, a combiner component merges the results of the two first-stage parts and returns them as the final output of the system. In the first part, a system based on missing-value imputation fills the empty entries of the rating matrix. To this end, from among the missing-data imputation methods, we designed one that is suited to filling a data set under very sparse conditions and then generalized it to our own method. In this regard, we propose a method based on grey distance clustering. In the second part, which is itself a hybrid ontology-based recommender system, we first extract the information for each item with a web crawler; then, in a base ontology, we use a proposed method to improve the ontology structure by removing identical edges. This increases the accuracy of measuring semantic similarity between items and users in later stages and significantly improves the effectiveness of the recommendations.
It should be noted that this ontology is not comprehensive. Finally, using a novel ontology-based similarity measure, we measure item-item, user-user, and user-item similarity. Using this similarity matrix, we cluster the users and items and then, for each user, store the users and items similar to it as a new feature in the user profile. This helps us speed up future searches for similar users and similar items; in effect, this feature speeds up the whole process. Because our goal is a system that balances accuracy and speed, we use these two criteria to evaluate the proposed system on a real data set. Comparison of the proposed method with several recent similar methods in this field (on the same data set) indicates that our method is slower than the fast methods but more accurate than them. The results also show that the proposed method is faster than the accurate methods, with quality that is competitive or even better.
Keywords	recommender system, ontology, memory-based filtering, model-based filtering, clustering, k-NN
Address	Islamic Azad University, Science and Research Branch, Tehran, Department of Computer Engineering, Iran; Iran University of Science and Technology, School of Computer Engineering, Iran; Islamic Azad University, Nourabad Mamasani Branch, Department of Computer Engineering, Iran, and Islamic Azad University, Yasuj Branch, Young Researchers and Elite Club, Iran; Islamic Azad University, Science and Research Branch, Tehran, Department of Electrical Engineering, Iran; Persian Gulf University, Faculty of Intelligent Systems Engineering and Data Science, Department of Electrical Engineering, Iran
Email	a.keshavarz@pgu.ac.ir

A content-collaborative recommender system based on clustering and ontology

Authors	Bahrani Payam, Minaei-Bidgoli Behrouz, Parvin Hamid, Mirzarezaee Mitra, Keshavarz Ahmad
Abstract	Recommender systems learn over time which products or items each person or customer is likely to prefer and recommend them accordingly. These systems often operate on the similar behavior of other (possibly similar) people. Finding similar people is generally very time-consuming because of the large number of users, and inaccurate because of the lack of information. For this reason, some methods have focused on increasing speed. Other methods add extra information in order to increase the accuracy of finding similar or neighboring users. Still others have turned to hybrid methods. Recently, researchers have used clustering-based methods, which find the most similar neighbors by clustering the users, together with content-based methods, sometimes augmented with an ontology, and have drawn on the advantages of these methods to address some of the above challenges acceptably. The proposed hybrid recommender system has two stages: in the first stage, two models make their predictions; in the second stage, a combining component merges the results of the two first-stage parts and returns them as the final results of the system. In the first part, a system based on missing-value imputation fills in the blanks in the rating matrix. To this end, from among the missing-data imputation methods, we designed one that is suited to filling a data set under very sparse conditions and then generalized it to our own method. In this regard, we propose a method based on grey distance clustering.
In the second part, which is itself a hybrid ontology-based recommender system, we first extract the information for each item with a web crawler; then, based on a basic article, we produce our own limited ontology and apply our proposed method to it. This method improves the ontology structure, which increases the accuracy of measuring semantic similarity between items and users in later stages and significantly improves the effectiveness of the recommendations. It should be noted that this ontology is not comprehensive. Finally, we measure item-item, user-user, and user-item similarity using a novel ontology-based similarity measure. Using this similarity matrix, we cluster the users and items and then store similar users and items as a new feature in the profile of each user or item. This helps speed up future searches for similar users and similar items; in effect, this feature speeds up the whole process. Because our goal is a system that balances accuracy and speed, we use these two criteria to evaluate the proposed system on a real data set. Comparison of the proposed method with several recent similar methods in this field (on the same data set) indicates that our method is slower than the fast methods but more accurate than them: for example, the fastest method takes 0.11 seconds per prediction with a recall of 0.30, while our method takes 0.40 seconds with a recall of 0.80. The results also suggest that the proposed method is faster than the accurate methods, with quality that is competitive or even better: for example, otopn takes about 0.58 seconds and has a recall of 0.65.
Keywords	recommender system, ontology, memory-based filtering, model-based filtering, clustering, k-NN
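
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. It rests on three loudly stated assumptions: a plain cluster-mean fill stands in for the grey distance clustering imputation, a hand-written similarity matrix `S` stands in for the ontology-based similarity measure, and the combiner is a simple weighted average, since the abstract does not say how the two predictions are merged. The neighbor cache mirrors the idea of storing similar users in each profile so that they are not searched for again.

```python
# Toy two-stage hybrid recommender. All names and data are illustrative assumptions.

def impute_by_cluster_mean(R, labels):
    """Stage 1, model A: fill missing ratings (None) with the mean rating the
    item received inside the user's cluster (stand-in for the paper's method)."""
    filled = [row[:] for row in R]
    for j in range(len(R[0])):
        item_known = [row[j] for row in R if row[j] is not None]
        for c in set(labels):
            rows = [u for u, lab in enumerate(labels) if lab == c]
            # Fall back to all users' ratings when the cluster has none.
            known = [R[u][j] for u in rows if R[u][j] is not None] or item_known
            if not known:
                continue  # nobody rated this item; leave it empty
            mean = sum(known) / len(known)
            for u in rows:
                if R[u][j] is None:
                    filled[u][j] = mean
    return filled

def build_neighbor_cache(S, k):
    """Store the k most similar users for every user once, so later lookups
    do not need to rescan the whole similarity matrix."""
    cache = {}
    for u, sims in enumerate(S):
        others = [v for v in range(len(S)) if v != u]
        others.sort(key=lambda v: sims[v], reverse=True)
        cache[u] = others[:k]
    return cache

def neighbor_predict(R, S, cache):
    """Stage 1, model B: similarity-weighted average of cached neighbors' ratings."""
    pred = [[None] * len(R[0]) for _ in R]
    for u, neighbors in cache.items():
        for j in range(len(R[0])):
            rated = [v for v in neighbors if R[v][j] is not None]
            total = sum(S[u][v] for v in rated)
            if total > 0:
                pred[u][j] = sum(S[u][v] * R[v][j] for v in rated) / total
    return pred

def blend(a, b, weight_a):
    """Weighted average of two predictions; use whichever exists if one is missing."""
    if a is None:
        return b
    if b is None:
        return a
    return weight_a * a + (1.0 - weight_a) * b

def combine(pred_a, pred_b, weight_a=0.5):
    """Stage 2: merge the two stage-one prediction matrices entry by entry."""
    return [[blend(a, b, weight_a) for a, b in zip(ra, rb)]
            for ra, rb in zip(pred_a, pred_b)]

# Four users, three items; None marks a missing rating.
R = [[5, 4, None],
     [4, None, 1],
     [1, 2, 5],
     [None, 1, 4]]
labels = [0, 0, 1, 1]            # assumed user clusters
S = [[1.0, 0.9, 0.1, 0.2],       # assumed user-user similarity
     [0.9, 1.0, 0.2, 0.1],
     [0.1, 0.2, 1.0, 0.8],
     [0.2, 0.1, 0.8, 1.0]]

filled = impute_by_cluster_mean(R, labels)
cache = build_neighbor_cache(S, k=1)
final = combine(filled, neighbor_predict(R, S, cache))
```

In this toy run the cache is built once and reused for every prediction, which is where the speed gain described in the abstract would come from; the accuracy side depends entirely on how good the imputation and the similarity measure are, which the sketch does not attempt to reproduce.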