The BACON approach for rank-deficient data

Authors
Kondylis A., Hadi A.S., Werner M.

Source
Pakistan Journal of Statistics and Operation Research, 2012, Volume 8, Issue 3, Pages 359-379

Abstract
Rank-deficient data are not uncommon in practice. They result from highly collinear variables and/or high-dimensional data. A special case of the latter occurs when the number of recorded variables exceeds the number of observations. The use of the BACON algorithm for outlier detection in multivariate data is extended here to include rank-deficient data. We present two approaches to identifying outliers in rank-deficient data based on the original BACON algorithm. The first algorithm projects the data onto a robust subspace of reduced dimension, while the second employs a ridge-type regularization on the covariance matrix. Both algorithms are tested on real as well as simulated data sets with good results in terms of their effectiveness in outlier detection. They are also examined in terms of computational efficiency and found to be very fast, with particularly good scaling properties for increasing dimension.
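
As a rough illustration of the second idea mentioned in the abstract (a ridge-type regularization of the covariance matrix), the sketch below computes Mahalanobis-type distances when the number of variables exceeds the number of observations. It is not the authors' algorithm: the function name `ridge_distances`, the penalty `lam`, and the rule for choosing the initial subset (the half of the observations closest to the coordinate-wise median) are assumptions made only for this example.

```python
import numpy as np

def ridge_distances(X, subset, lam=1.0):
    """Squared Mahalanobis-type distances of all rows of X, using a center
    and a ridge-regularized covariance estimated from the rows in `subset`.
    `lam` is a hypothetical ridge penalty, not a value from the paper."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    Xs = X[subset]
    center = Xs.mean(axis=0)
    Zs = Xs - center
    S = Zs.T @ Zs / (len(Xs) - 1)        # singular when p exceeds the subset size
    S_ridge = S + lam * np.eye(p)        # ridge term makes the matrix invertible
    Z = X - center
    sol = np.linalg.solve(S_ridge, Z.T)  # shape (p, n)
    return np.einsum("ij,ji->i", Z, sol)

# Toy data with more variables than observations and one planted outlier.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
X[0] += 10.0                             # row 0 is shifted in every coordinate

# Assumed starting subset: the n // 2 rows closest to the coordinate-wise median.
med = np.median(X, axis=0)
order = np.argsort(np.linalg.norm(X - med, axis=1))
subset = order[: n // 2]

d = ridge_distances(X, subset, lam=1.0)
print(int(np.argmax(d)))                 # index of the most distant observation
```

Estimating the center and covariance from a presumed clean subset, rather than from all rows, keeps the planted outlier from inflating the covariance and masking itself. An iterative procedure would then update the subset from the distances and repeat.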

Keywords
High-dimensional data; Mahalanobis distance; Outlier detection; Spatial median

Address
Philip Morris International ACR, Switzerland; Department of Mathematics and Actuarial Science, The American University in Cairo, Egypt; Department of Statistical Sciences, United States; Department of Statistics, United States