Alternating conditional expectation (ACE) algorithm for robust regression analysis of a simulated dataset in the presence of outliers
|
Authors
|
Khanmohammadi Khorrami Mohammadreza, Mohammadi Mahsa, S.Hajiseyedrazi Zahra
|
Source
|
9th Iranian Biennial National Chemometrics Seminar - 1402 (Solar Hijri) - Edition: 9 - Conference code: 02230-81220 - Pages: 0-0
|
Abstract
|
The purpose of a univariate regression model is to discover the relationship between the response and descriptor variables, which means accurately calculating the regression coefficients. The presence of random noise and outlier data is recognized as an important source of uncertainty in regression models. Outliers are data points with high residual values compared to other points, leading to abnormalities in the regression process. This issue becomes more challenging as the dataset becomes smaller. Two strategies are used to deal with data contaminated with outliers: first, outlier detection procedures (such as Dixon's test, Grubbs' test, Cook's squared distance, squared Mahalanobis distance, etc.), and second, robust methods (median-based techniques, the Jacobian matrix method, additivity and variance stabilization (AVAS), and alternating conditional expectation (ACE)) [1-3].

This study investigated the effect of outlier data on the performance of various regression models: simple least squares (SLS); the median-based methods single median (SM), repeated median (RM), and least median of squares (LMS); the Jacobian matrix technique; AVAS; and ACE. Six pairs of data points were defined as raw data, and the presence of an outlier in the response variable was investigated.

For the dataset with outliers, SLS, as a representative of classical regression methods, gave weak results. The efficiency of the median-based approaches was also poor. The results of the Jacobian method were not desirable, which may stem from not defining the initial equation for this model. The performance of AVAS was broadly satisfactory, but ACE was the best procedure. The final statistical results of ACE were an R^2 of 1.000, an adjusted R^2 of 1.000, an error sum of squares (SSE) of 7.32e-30, and an infinite variance inflation factor (VIF). Selecting the transformation function with the highest correlation coefficient between the response and descriptor variables and stabilizing the error variance distribution are important advantages of ACE that lead to the best results [4].
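To make the comparison between classical and median-based fitting concrete, the following is a minimal sketch and not the authors' procedure. It fits simple least squares and a repeated median line (in Siegel's formulation, assumed here to match the RM method named above) to six hypothetical data pairs with one outlier in the response. The data values, the function names `sls_fit` and `repeated_median_fit`, and the choice of a straight-line model are all assumptions made for illustration. The sketch does not reproduce the study's dataset or its findings, and ACE itself is not implemented.

```python
import numpy as np


def sls_fit(x, y):
    """Simple least squares straight-line fit; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


def repeated_median_fit(x, y):
    """Repeated median fit (assumed Siegel formulation).

    For each point i, take the median of the slopes to every other point j,
    then take the median of those per-point medians as the slope.
    The intercept is the median of y - slope * x.
    """
    n = len(x)
    per_point_medians = []
    for i in range(n):
        slopes = [(y[j] - y[i]) / (x[j] - x[i]) for j in range(n) if j != i]
        per_point_medians.append(np.median(slopes))
    slope = np.median(per_point_medians)
    intercept = np.median(y - slope * x)
    return slope, intercept


# Hypothetical data (not from the study): six points on y = 2x + 1,
# with the last response replaced by an outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 2.0 * x + 1.0
y[5] = 30.0  # the uncontaminated value would be 13.0

sls_slope, sls_intercept = sls_fit(x, y)
rm_slope, rm_intercept = repeated_median_fit(x, y)

print(f"SLS: slope={sls_slope:.3f}, intercept={sls_intercept:.3f}")
print(f"RM:  slope={rm_slope:.3f}, intercept={rm_intercept:.3f}")
```

On this toy dataset the least squares slope is pulled well away from 2 by the single outlier, while the repeated median recovers the underlying line. How each estimator behaves on other data, including the dataset used in the study, depends on the size and position of the outlier.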
|
Keywords
|
SLS, SM, RM, LMS, Jacobian, AVAS, ACE
|
Address
|
Iran; Iran; Iran