Design of a Data-Mining-Based Algorithm for Predicting Diabetes


Author	Navid Rafiei
Source	Iranian Journal of Diabetes and Metabolism - 1402 SH - Volume: 23 - Issue: 1 - Pages: 53-67
Abstract	Introduction: Diabetes causes many deaths each year, and many people living with the disease do not adequately understand their own health status. This study proposes a data-mining-based model for the early diagnosis and prediction of diabetes. Methods: Although the k-means technique is simple and applicable to a wide range of data types, it is highly sensitive to the initial positions of the cluster centers, which determine the final clustering result. Consequently, it either provides a suitable, well-clustered dataset for the logistic regression model or, when the original dataset is clustered incorrectly, yields a smaller amount of data, which limits the performance of the logistic regression model. The main aim of this paper is to identify ways to improve k-means clustering and the resulting accuracy of logistic regression. The proposed algorithm therefore combines principal component analysis, k-means, and a logistic regression model. Results: The results of this study show that the k-means clustering accuracy achieved is considerably higher than that reported by other researchers in similar studies. In addition, compared with results from other algorithms, the logistic regression model predicted the onset of diabetes at an improved level. A further practical advantage is that the proposed algorithm successfully modeled a new dataset. Conclusion: Overall, the proposed approach can be used effectively for the prediction and early diagnosis of diabetes.
Keywords	diabetes, prediction, principal component analysis, k-means, logistic regression
Address	Islamic Azad University, Bandar Abbas Branch, Department of Industrial Engineering, Iran
Email	n.rafiei@iau-tnb.ac.ir

Design of a Data-Mining-Based Algorithm for Predicting Diabetes

Author	Navid Rafiei
Abstract	Background: Diabetes causes a large number of deaths each year, and many people living with the disease do not learn of their health status early enough. In this paper, we propose a data-mining-based model for the early diagnosis and prediction of diabetes. Methods: Although k-means is simple and can be applied to a wide variety of data types, it is highly sensitive to the initial locations of the cluster centers, which determine the final clustering result. That result either yields an efficiently and adequately clustered dataset for the logistic regression model or, when the main dataset is clustered incorrectly, leaves a smaller amount of data, thereby restricting the performance of the logistic regression model. The main purpose of this study was to identify ways of improving k-means clustering and the resulting logistic regression accuracy. Our algorithm therefore comprises principal component analysis, k-means, and a logistic regression model. Results: The results show that the k-means clustering accuracy obtained is much higher than that reported by other researchers in similar studies. Compared with the results obtained from other algorithms, the logistic regression model also performed at an improved level in predicting the onset of diabetes. Another practical advantage is that the proposed algorithm was able to model a new dataset successfully. Conclusion: In general, the proposed approach can be used effectively for the prediction and early diagnosis of diabetes.
Keywords	diabetes, prediction, principal component analysis, k-means, logistic regression
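
The Methods paragraph names three stages (principal component analysis, k-means, logistic regression) without giving implementation details. The Python sketch below is one plausible reading of that pipeline, not the author's implementation. The use of scikit-learn, the synthetic data from `make_blobs`, the function name `pca_kmeans_logreg`, the choice of two components and two clusters, and the step that keeps only samples whose cluster matches their class label are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def pca_kmeans_logreg(X, y, n_components=2, seed=0):
    """Hypothetical pipeline: PCA -> k-means filtering -> logistic regression.

    Assumes y is a NumPy array of binary labels coded as 0 and 1.
    """
    # Standardize the features, then project onto the leading components.
    X_scaled = StandardScaler().fit_transform(X)
    X_pca = PCA(n_components=n_components, random_state=seed).fit_transform(X_scaled)

    # n_init=10 restarts k-means from several initial centers and keeps the
    # best run, a common way to reduce sensitivity to initialization.
    clusters = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X_pca)

    # Cluster ids are arbitrary, so flip them if that agrees better with y.
    agreement = float(np.mean(clusters == y))
    if agreement < 0.5:
        clusters = 1 - clusters
        agreement = 1.0 - agreement

    # Assumption: keep only samples whose cluster matches their class label,
    # which leaves a smaller dataset when the clustering is poor.
    keep = clusters == y
    X_kept, y_kept = X_pca[keep], y[keep]

    X_train, X_test, y_train, y_test = train_test_split(
        X_kept, y_kept, test_size=0.25, random_state=seed, stratify=y_kept
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return {
        "cluster_agreement": agreement,
        "n_kept": int(keep.sum()),
        "test_accuracy": float(model.score(X_test, y_test)),
    }


# Synthetic two-class data standing in for a real diabetes dataset.
X, y = make_blobs(
    n_samples=400,
    centers=[[0.0] * 8, [2.0] * 8],
    cluster_std=2.0,
    random_state=0,
)
result = pca_kmeans_logreg(X, y)
print(result)
```

Because the filtering step uses the class labels, the accuracy reported on the retained samples is not an estimate of performance on unfiltered data. A real evaluation would need a held-out set that is not filtered this way.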