|
|
Enhanced Data Point Importance for Variable Selection in Multivariate Calibration Models
|
|
|
|
|
Authors
|
Vali Zade Somaye, Abdollahi Hamid
|
Source
|
9th Iranian Biennial National Chemometrics Seminar - 1402 - Edition: 9 - 9th Iranian Biennial National Chemometrics Seminar - Conference code: 02230-81220 - Pages: 0-0
|
Abstract
|
Variable selection is a critical step in obtaining good prediction performance and in explaining the underlying phenomena in multivariate calibration. We introduce a novel use of data point importance (DPI), designed specifically for variable selection in the context of multiblock data (PLS). As previously described [1], the DPI framework focuses on establishing and ordering the importance of individual data points so as to preserve the underlying data microstructure. The more crucial variables are ordered so that the first variable has the greatest influence on preserving that microstructure. Here, a new adaptation of DPI has been developed and implemented for variable selection. In this modified approach, termed enhanced DPI, the essential points are prioritized systematically and the concept of DPI is extended to cover all data points. The data points are assigned to layers according to the data structure, and the importance of each layer also influences the ordering of data point importance. In this work, we propose identifying the DPI values and ordering them according to changes in the relative area of the inner polygon in the PLS space. The latent variables in PLS are linear combinations of the observed variables, and their weights are chosen to maximize the covariance between the latent variable(s) representing the X data and the Y data (which include distinct attributes such as the concentrations of an active compound). This criterion matches the method's overall objective of modeling Y as a function of X. Consequently, PLS defines basis vectors in the row and column spaces, with the rows and columns represented as points distributed within their respective spaces.
PLS loadings delineate the inner polygon of these row and column spaces, which allows the data rows and columns to be arranged and sorted in the PLS space. The proposed method was used to rank and select variables in spectroscopic multivariate PLS calibration for three-component simulated and real data. The proposed procedure was compared with variable importance in projection (VIP), a well-known variable selection method.
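For readers unfamiliar with the VIP baseline mentioned above, the following is a minimal sketch of how VIP scores can be obtained from a single-response PLS model. It does not implement the enhanced DPI procedure, whose details are not given in this abstract. The function names `pls1_nipals` and `vip_scores`, the simulated Gaussian spectra, the noise level, and the VIP > 1 cutoff (a common rule of thumb) are all assumptions made for illustration.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """PLS1 via NIPALS on mean-centered data (illustrative helper).

    Returns weights W (p x A), scores T (n x A) and y-loadings q (A,).
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n, p = X.shape
    W = np.zeros((p, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        # The unit-norm weight vector maximizing cov(Xw, y) is X^T y / ||X^T y||.
        w = X.T @ y
        w /= np.linalg.norm(w)
        t = X @ w                      # scores (latent variable)
        tt = t @ t
        p_load = X.T @ t / tt          # X-loadings
        q[a] = y @ t / tt              # y-loading
        X = X - np.outer(t, p_load)    # deflate X
        y = y - q[a] * t               # deflate y
        W[:, a], T[:, a] = w, t
    return W, T, q

def vip_scores(W, T, q):
    """Variable importance in projection for a PLS1 model."""
    p = W.shape[0]
    # Sum of squares of y explained by each component.
    ss = (q ** 2) * np.sum(T ** 2, axis=0)
    w_norm2 = (W / np.linalg.norm(W, axis=0)) ** 2
    return np.sqrt(p * (w_norm2 @ ss) / ss.sum())

# Hypothetical three-component simulated spectra (not the authors' data).
rng = np.random.default_rng(0)
channels = np.arange(50)
centers = [12, 25, 38]
S = np.stack([np.exp(-0.5 * ((channels - c) / 4.0) ** 2) for c in centers])
C = rng.uniform(0.1, 1.0, size=(30, 3))            # concentrations
X = C @ S + 0.001 * rng.standard_normal((30, 50))  # mixture spectra plus noise
y = C[:, 0]                                        # calibrate the first component

W, T, q = pls1_nipals(X, y, n_components=3)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1.0)  # rule-of-thumb cutoff, an assumption here
print("channels with VIP > 1:", selected)
```

By construction the squared VIP scores average to one across variables, which is why a cutoff of one is often used as a starting point; any ranking produced by another method, such as the one proposed here, could be compared against `vip` on the same data.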
|
|
|
Address
|
, Iran, , Iran