Scalable Unsupervised Feature Selection via Matrix Learning and Bipartite Graph Theory



Authors	Salehnezhad Kosar, Daneshpour Negin
Source	Iranian Electrical and Electronics Engineering (مهندسي برق و الكترونيك ايران) - 1402 (Solar Hijri) - Volume: 20 - Issue: 3 - Pages: 135-148
Abstract	With the rapid spread of technology, huge volumes of unlabeled, high-dimensional data have come to need processing. For dimensionality reduction, unsupervised feature selection is regarded as an important preprocessing step before machine learning tasks. This paper proposes an unsupervised feature selection method. The method operates dynamically and scalably, based on a matrix graph and a weight matrix. To improve its performance, bipartite graph theory is applied in place of the Lagrange function when constructing the weight matrix. Feature selection is performed on the matrix graph. This graph is built using k nearest neighbors, which makes the method more robust to noise. The global structure of the original data is also preserved by constructing the reconstruction weight matrix under a low-rank constraint. In addition, the feature score, which explicitly reflects the strength of each feature, is modeled with the Frobenius norm. The proposed method is compared with similar methods on three criteria: classification accuracy, parameter sensitivity, and time complexity. Experiments show that the classification accuracy of the proposed method improves by 2.83% on average. Its time complexity is also reduced to max{O(n^2 d), O(nm)}, where n is the number of samples, d is the number of features, and m is the number of anchor points.
Keywords	data mining, preprocessing, feature selection, unsupervised method, graph
Address	Shahid Rajaee Teacher Training University, Faculty of Computer Engineering, Iran; Shahid Rajaee Teacher Training University, Faculty of Computer Engineering, Iran
Email	ndaneshpour@sru.ac.ir

Scalable Unsupervised Feature Selection via Matrix Learning and Bipartite Graph Theory

Authors	Salehnezhad Kosar, Daneshpour Negin
Abstract	With the rapid spread of technology, large volumes of unlabeled, high-dimensional data need to be processed. To reduce the dimensionality, unsupervised feature selection is an important preprocessing step before machine learning tasks. In this paper, an unsupervised feature selection method is proposed. The method works dynamically and scalably, based on a matrix graph and a weight matrix. To improve its performance, bipartite graph theory is applied instead of the Lagrange function to construct the weight matrix. Feature selection is performed on the matrix graph. This graph is constructed using k nearest neighbors, which makes the method more robust to noise. The global structure of the original data is also preserved by constructing a reconstruction weight matrix under a low-rank constraint. In addition, the feature score, which explicitly reflects the strength of the features, is modeled using the Frobenius norm. The proposed method is compared with similar methods on three criteria: classification accuracy, parameter sensitivity, and time complexity. Experiments show that the classification accuracy of the proposed method improves by 2.83% on average. Its time complexity is also reduced to max{O(n^2 d), O(nm)}, where n is the number of samples, d is the number of features, and m is the number of anchor points.
Keywords	data mining, preprocessing, feature selection, unsupervised method, graph
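
The abstract names two building blocks without giving their formulas: a graph built from k nearest neighbors, and a feature score modeled with the Frobenius norm. The sketch below illustrates one common way such pieces are written; it is not the authors' formulation. The function names (`knn_graph`, `feature_scores`, `select_features`), the 0/1 edge weights, the symmetrization step, and the assumption that a learned weight matrix `W` has one row per feature are all hypothetical choices made for illustration. The score of feature j is taken as the l2 norm of row j of `W`, so the squared scores sum to the squared Frobenius norm of `W`.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric 0/1 adjacency of a k-nearest-neighbor graph over the rows of X.

    Illustrative only: the paper's graph construction may weight edges differently.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n); this step costs O(n^2 d).
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest samples
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)  # keep an edge if either endpoint selected it


def feature_scores(W):
    """Score feature j by the l2 norm of row j of W (assumed shape: d x c)."""
    return np.linalg.norm(np.asarray(W, dtype=float), axis=1)


def select_features(W, num):
    """Indices of the `num` highest-scoring features, best first."""
    return np.argsort(-feature_scores(W))[:num]
```

Under these assumptions, `select_features(W, num)` would be called on whatever weight matrix the optimization produces, and `knn_graph(X, k)` on the n x d data matrix. The dense distance computation here is itself O(n^2 d); an anchor-based bipartite graph, as mentioned in the abstract, would instead compare the n samples against m anchor points.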