Feature Selection Using the Sailfish Algorithm and Data Classification with Support Vector Machine

Authors	Zandi, Arash; Shadravan, Soodeh
Source	First National Event and Conference on Convergent Sciences and Technologies and Quantum Technologies - 1403 - Edition: 1 - Conference code: 03230-85168 - Pages: 0-0
Abstract	This paper presents a method for feature selection using the sailfish algorithm and data classification with a support vector machine. The proposed method was implemented and evaluated on several datasets. All classifiers give acceptable results in low dimensions, whereas in high dimensions they run into the curse of dimensionality. Feature selection can therefore have a considerable effect on the correct recognition rate of the classifier. The main goal of feature selection is to reduce the dimension of the feature vector used in classification while still obtaining an acceptable classification rate. The results indicate that using the sailfish algorithm leads to a solution with fewer features and a lower error rate than other methods. The minimum classification error rate with the proposed method is 5%, whereas it is 10.1% without the sailfish algorithm and 17.9% without support vector machine classification. The proposed method can also identify the abnormal, cancerous part of the cells. In addition, examining the classification error for blood cancer, prostate cancer, and lung cancer while varying the parameter θ indicates that changing θ has no effect on the classification error, but that the classification error generally follows a downward trend as the number of features increases. Furthermore, decreasing the parameter e reduces the number of iterations of the algorithm and, consequently, the computation time. However, the values of the components of the objective function for the different cancers increase as θ decreases. Examining the computational performance on different datasets also shows that the classification error remains constant even when θ increases.
Keywords	feature selection, sailfish algorithm, data classification, support vector machine method, computational performance
Address	Iran, Iran
Email	shadravan239@gmail.com

Feature Selection Using the Sailfish Algorithm and Data Classification with Support Vector Machine

Authors
Abstract	This paper presents a method for feature selection using the sailfish algorithm and data classification with a support vector machine. All classifiers produce acceptable results in low dimensions, whereas in high dimensions they suffer from the curse of dimensionality. Feature selection can therefore have a significant effect on the correct recognition rate of the classification algorithm. The main goal of feature selection is to reduce the dimension of the feature vector used in classification while still achieving an acceptable classification rate. The results show that using the sailfish algorithm yields a solution with fewer features and a lower error rate than other methods. The minimum classification error rate with the proposed method is 5%, compared with 10.1% without the sailfish algorithm and 17.9% without support vector machine classification. The proposed method is also able to identify the abnormal, cancerous part of the cells. In addition, an examination of the classification error for blood cancer, prostate cancer, and lung cancer under varying θ shows that changing θ has no effect on the classification error, whereas the classification error generally follows a downward trend as the number of features increases. Moreover, decreasing the parameter e reduces the number of iterations of the algorithm and, as a result, the computation time. However, the values of the components of the objective function for the different cancers increase as θ decreases. An examination of computational performance on different datasets likewise shows that the classification error remains constant even when θ increases.
Keywords	feature selection, sailfish algorithm, data classification, support vector machine method, computational performance
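
The abstract describes wrapper-style feature selection: a search procedure proposes feature subsets, and each subset is scored by a classifier's error together with the number of features kept. The record does not give the paper's objective function or the sailfish update rules, so the following is only a minimal sketch of that general idea under stated assumptions. The weight `alpha`, the leave-one-out nearest-centroid classifier (a stand-in for the support vector machine), the single-bit-flip search (a stand-in for the sailfish algorithm), and the toy data are all hypothetical and are not taken from the paper. In particular, `alpha` is not the paper's θ or e.

```python
import random


def loo_error(X, y, mask):
    """Leave-one-out error of a nearest-centroid classifier on the kept features.

    Assumption: this simple classifier stands in for the SVM named in the abstract.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # no features selected: treat as the worst case
    errors = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def sq_dist(label):
            # squared distance from sample i to the centroid of `label`
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        errors += min(sums, key=sq_dist) != y[i]
    return errors / len(X)


def fitness(X, y, mask, alpha=0.9):
    """Hypothetical objective: weighted error plus fraction of features kept."""
    return alpha * loo_error(X, y, mask) + (1 - alpha) * sum(mask) / len(mask)


def bit_flip_search(X, y, iterations=200, seed=0):
    """Flip one random bit per step and keep the change only if fitness improves.

    Assumption: a plain local search standing in for the sailfish algorithm.
    """
    rng = random.Random(seed)
    n = len(X[0])
    best = [1] * n  # start with every feature selected
    best_fit = fitness(X, y, best)
    for _ in range(iterations):
        cand = best[:]
        j = rng.randrange(n)
        cand[j] = 1 - cand[j]
        cand_fit = fitness(X, y, cand)
        if cand_fit < best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit


def make_toy_data(seed=1, per_class=15, noise_features=5):
    """Two classes; feature 0 separates them, the remaining features are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(per_class):
            X.append([4.0 * label + rng.gauss(0, 1)]
                     + [rng.gauss(0, 1) for _ in range(noise_features)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, fit = bit_flip_search(X, y)
    print("selected mask:", mask, "fitness:", round(fit, 4))
```

Because a candidate is accepted only when its fitness strictly improves, the returned subset never scores worse than the full feature set under this objective. A population-based metaheuristic would explore the same search space with a different update rule.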