   Evaluating the Efficiency of Parameter Reduction Methods in Improving the Accuracy of Water Quality Index Modeling in the Qizil-Uzen River Using Machine Learning Algorithms
   
Authors Sattari Mohammad Taghi, Shirini Kimia, Javidan Sahar
Source Water and Soil Management and Modeling - 1403 - Volume 4 - Issue 2 - Pages 89-104
Abstract    Knowledge of water quality is one of the key requirements in planning, developing, and protecting water resources. Determining water quality for different uses, including irrigation and drinking, is essential in different regions. Modern data mining methods can offer a suitable approach for predicting and classifying water quality. In this study, the water quality of the Qizil-Uzen River was evaluated at the Qara Gunei station, a village in Halab District, Ijrud County, Zanjan Province. To this end, the drinking water quality index (WQI) was calculated from the chemical parameters total hardness, pH, electrical conductivity, total dissolved solids, calcium, sodium, magnesium, potassium, chloride, carbonate, bicarbonate, and sulfate over a 21-year statistical period (1378-1398 in the Solar Hijri calendar). Given the relatively large number of parameters, principal component analysis and independent component analysis were used for dimensionality reduction. Machine learning algorithms, namely decision tree, logistic regression, and multilayer perceptron artificial neural network, were then used to model the water quality index. With these methods, the number of parameters needed to calculate the quality index was reduced from 12 to 2. Reducing the dimensionality of the data saves time in sampling, sample monitoring, and water quality determination, and considerably lowers the cost of modeling. The results showed that, of the two dimensionality reduction methods, principal component analysis can perform better than independent component analysis. The results also showed that, among the modeling methods, the multilayer perceptron neural network combined with principal component analysis performed best, with a coefficient of determination of 0.99, a root mean square error of 44.79, and a modified Willmott coefficient of 0.99. Because high data dimensionality makes water quality modeling complex and time-consuming, dimensionality reduction methods such as principal component analysis are recommended. Overall, the findings indicate that principal component analysis outperforms independent component analysis.
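The three numerical criteria named in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the abstract does not give the formulas, so the sketch assumes the coefficient of determination is computed as 1 - SS_res/SS_tot and that the "modified Willmott coefficient" is the absolute-value form of Willmott's index of agreement. The function names and the sample values are hypothetical.

```python
import numpy as np


def r_squared(obs, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def modified_willmott(obs, pred):
    """Modified index of agreement (assumed absolute-value form):
    1 - sum|O - P| / sum(|P - mean(O)| + |O - mean(O)|)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    numerator = np.sum(np.abs(obs - pred))
    denominator = np.sum(np.abs(pred - obs.mean()) + np.abs(obs - obs.mean()))
    return float(1.0 - numerator / denominator)


# Made-up values for illustration only; these are not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(r_squared(observed, predicted))
print(rmse(observed, predicted))
print(modified_willmott(observed, predicted))
```

Note that RMSE is expressed in the units of the index itself, so it is only comparable between models fitted to the same WQI series, whereas the other two criteria are dimensionless.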
Keywords water quality index, dimensionality reduction, machine learning algorithms, principal component analysis, independent component analysis
Address University of Tabriz, Faculty of Agriculture, Department of Water Science and Engineering, Iran; University of Tabriz, Faculty of Electrical and Computer Engineering, Department of Computer Engineering, Iran; University of Tabriz, Faculty of Agriculture, Department of Water Science and Engineering, Iran
Email javidansahar77115@gmail.com
 
   Evaluating the Efficiency of Dimensionality Reduction Methods in Improving the Accuracy of Water Quality Index Modeling in the Qizil-Uzen River Using Machine Learning Algorithms
   
Authors Sattari Mohammad Taghi, Shirini Kimia, Javidan Sahar
Abstract    Introduction: Water quality assessment is paramount for various sectors, including environmental planning, public health, and industrial operations. With the increasing importance of ensuring safe water sources, especially for drinking and irrigation, modern methodologies such as data mining offer valuable tools for predictive analysis and classification of water quality. Knowledge of water quality is one of the most important needs in planning, developing, and protecting water resources, and determining the quality of water for different uses, including irrigation and drinking, is essential in different regions. In the current study, the water quality of the Qizil-Uzen River was evaluated at the Qara Gunei station. The drinking water quality index (WQI) was calculated from total hardness, pH, electrical conductivity, total dissolved solids, calcium, sodium, magnesium, potassium, chloride, carbonate, bicarbonate, and sulfate over a 21-year statistical period (2000-2020).
Materials and Methods: Due to the relatively large number of variables, principal component analysis (PCA) and independent component analysis (ICA) were used to reduce dimensionality. Machine learning algorithms, namely decision tree, logistic regression, and multilayer perceptron (MLP) artificial neural network, were then used to model the water quality index.
By using these methods, the number of parameters needed to calculate the quality index was reduced from 12 to 2. Reducing the dimensionality of the data saves time in sampling, monitoring the samples, and determining water quality, and significantly reduces the cost of modeling. The results showed that, of the two dimensionality reduction methods, principal component analysis can perform better than independent component analysis. The WQI was modeled with machine learning algorithms including decision tree, logistic regression, and artificial neural network, using data from the Qara Gunei station on the Qizil-Uzen River. To estimate the numerical values of the WQI, the TH, pH, EC, TDS, Ca, Na, Mg, K, Cl, CO3, HCO3, and SO4 parameters of this station over the 21-year statistical period (1378-1398 in the Solar Hijri calendar) were used. PCA and ICA were used to select the input parameters. Modeling was carried out in a Python programming environment. Of the available samples, 75% were used for training and 25% for testing.
Results and Discussion: In the first stage of modeling the water quality index, the dimensionality reduction methods PCA and ICA were used to reduce the time and cost of implementation. In the second stage, machine learning methods, namely decision tree, logistic regression, and multilayer perceptron, were used. In the method used by Tripathi and colleagues, principal component analysis reduced the number of parameters needed to calculate the quality index from 28 to 9, and the water quality index was calculated from those 9 parameters. In the present study, both PCA and ICA reduced the problem from 12 dimensions to 2. The results show that PCA can improve performance at low cost and with high accuracy, because it can help reduce noise in the data, support feature selection, and generate uncorrelated features. The models were compared using numerical criteria (R2, RMSE, and the modified Willmott coefficient) and a graphical criterion (the Taylor diagram). The results show that the multilayer perceptron, decision tree, and logistic regression methods model the water quality index accurately. In this research, for the first time, the ICA dimension reduction algorithm is used to reduce the dimensions of the problem while predicting the water quality index with an accuracy of over 90%.
Conclusion: Water quality index modeling holds significant relevance in agricultural practices, where access to clean water is crucial for irrigation and crop growth. Surprisingly, only a limited number of studies have explored variable reduction methods in water quality index modeling, with none incorporating the relatively novel independent component analysis (ICA) method for dimensionality reduction. The current research fills this gap by employing PCA and ICA to reduce the dimensionality of large datasets in water quality index modeling. By using these methods, the study aims to improve efficiency and accuracy in assessing water quality, thereby offering valuable insights for agricultural water management. Following dimensionality reduction, the dataset is modeled with various machine learning algorithms. This approach not only reduces computational cost but also supports a deeper understanding of the complex interrelationships among water quality parameters. In this way, the efficacy of ICA alongside PCA in addressing water quality index modeling challenges is evaluated.
By integrating these techniques with machine learning methodologies, the study endeavors to provide actionable intelligence for agricultural stakeholders, aiding in informed decision-making and resource allocation. Moreover, by venturing into unexplored territory with the inclusion of ICA, the research contributes to expanding the methodological toolkit available for water quality assessment. As agriculture faces increasing pressure from climate change and resource scarcity, such innovative approaches hold promise in ensuring sustainable water management practices.
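The workflow described in the abstract (reduce 12 parameters to 2 components with PCA or ICA, split the samples 75%/25%, then fit a model) could be sketched with scikit-learn roughly as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data are synthetic, the index weights are invented, the hyperparameters are arbitrary, and the use of scikit-learn itself is an assumption (the abstract only says Python). Logistic regression is left out because it needs class labels, and the abstract does not say how the index was discretized.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the 12 chemical parameters (NOT the study's data):
# two hidden drivers mixed into 12 correlated columns plus a little noise.
rng = np.random.default_rng(0)
n_samples, n_params = 240, 12
latent = rng.normal(size=(n_samples, 2))
mixing = rng.normal(size=(2, n_params))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_params))

# Hypothetical index: a weighted sum of the parameters (weights are made up).
weights = rng.uniform(0.5, 1.5, size=n_params)
y = X @ weights

# 75% of the samples for training and 25% for testing, as in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Factories so that every pipeline gets fresh, unfitted estimators.
reducers = {
    "PCA": lambda: PCA(n_components=2),
    "ICA": lambda: FastICA(n_components=2, random_state=0, max_iter=1000),
}
models = {
    "decision_tree": lambda: DecisionTreeRegressor(random_state=0),
    "mlp": lambda: MLPRegressor(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
}

scores = {}
for reducer_name, make_reducer in reducers.items():
    for model_name, make_model in models.items():
        # Scaling and reduction are fitted on the training split only.
        pipeline = make_pipeline(StandardScaler(), make_reducer(), make_model())
        pipeline.fit(X_train, y_train)
        # score() returns R^2 on the held-out 25%.
        scores[(reducer_name, model_name)] = pipeline.score(X_test, y_test)

for key, value in scores.items():
    print(key, round(value, 3))
```

Wrapping the scaler, the reducer, and the model in one pipeline keeps the test split out of the fitting of the components, which is one common way to avoid leakage; whether the study did this is not stated in the abstract.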
Keywords dimensionality reduction, independent component analysis, machine learning algorithms, principal component analysis, water quality index
 
 
