Comparing the performance of XGBoost, gradient boosting and GBLUP models under different genomic prediction scenarios
   
Author: Ghafouri-Kesbi, Farhad
Source: Journal of Livestock Science and Technologies, 2024, Volume 12, Issue 1, Pages 31-37
Abstract: The aim of this study was to evaluate the performance of the XGBoost algorithm in genomic evaluation of complex traits as an alternative to the gradient boosting machine (GBM) algorithm. To this end, genotypic matrices containing genotypic information for 5,000 (S1), 10,000 (S2) and 50,000 (S3) single nucleotide polymorphisms (SNPs) for 1,000 individuals were simulated. Besides XGBoost and GBM, GBLUP, which is known as an efficient algorithm in terms of accuracy, computing time and memory requirement, was also used to predict genomic breeding values. XGBoost, GBM and GBLUP were run in R using the xgboost, gbm and synbreed packages, respectively. All analyses were performed on a machine equipped with a Core i7-6800K CPU with 6 physical cores and 32 gigabytes of memory. Pearson's correlation between predicted and true breeding values (rp,t) and the mean squared error (MSE) of prediction were computed to compare the predictive performance of the methods. While GBLUP was the most efficient user of memory, GBM required a considerably large amount of memory to run. As the size of the data increased from S1 to S3, GBM dropped out of the competition, mainly due to its high demand for memory. Parallel computing with XGBoost reduced running time by 99% compared to GBM. The speedup ratios (the ratio of the GBM runtime to the runtime of parallel XGBoost) were 444 and 554 for the S1 and S2 scenarios, respectively. In addition, the parallelization efficiencies (speedup ratio / number of cores) were 74 and 92 for the S1 and S2 scenarios, respectively, indicating that the efficiency of parallel computing increased with the size of the data. XGBoost was considerably faster than GBLUP in all the scenarios studied. The accuracy of genomic breeding values predicted by XGBoost was similar to that of values predicted by GBM. While the accuracy of prediction in terms of rp,t was higher for GBLUP, the MSE of prediction was lower for XGBoost, especially for larger datasets. Our results showed that XGBoost could be an efficient alternative to GBM, as it had the same prediction accuracy, was extremely fast and required significantly less memory to predict genomic breeding values.
Keywords: genomic evaluation, parallel computing, computing time, SNP
Address: Bu-Ali Sina University, Faculty of Agriculture, Department of Animal Science, Iran
Email: farhad_ghy@yahoo.com
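
The comparison metrics named in the abstract (Pearson's correlation, MSE, speedup ratio and parallelization efficiency) can be illustrated with a minimal Python sketch. This is not the study's code, which was run in R; the function names and the runtimes in the usage example are hypothetical, chosen only to reproduce the definitions given in the abstract.

```python
import math


def pearson_correlation(predicted, true):
    """Pearson's correlation between predicted and true breeding values."""
    n = len(predicted)
    if n != len(true) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, true))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)


def mean_squared_error(predicted, true):
    """Mean squared error of prediction."""
    if len(predicted) != len(true) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, true)) / len(predicted)


def speedup_ratio(baseline_seconds, parallel_seconds):
    """Ratio of the baseline runtime to the parallel runtime."""
    return baseline_seconds / parallel_seconds


def parallel_efficiency(speedup, n_cores):
    """Speedup ratio divided by the number of cores, as defined in the abstract."""
    return speedup / n_cores


if __name__ == "__main__":
    # Hypothetical runtimes (not from the study) that give a speedup of 444.
    s = speedup_ratio(888.0, 2.0)
    # With 6 physical cores, this matches the efficiency reported for S1.
    print(s, parallel_efficiency(s, 6))
```

With the speedup ratios reported in the abstract (444 and 554) and 6 cores, `parallel_efficiency` gives 74 and about 92.3, which agrees with the reported values of 74 and 92.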