Prediction of ovarian cancer in Holstein cattle using machine learning and microarray data
|
|
Authors
|
Ghaed-Rahmati Mostafa, Ghafouri-Kesbi Farhad, Ahmadi Ahmad
|
|
Source
|
Journal of Livestock Science and Technologies - 2026 - Volume: 14 - Issue: 1 - Pages: 57-65
|
|
Abstract
|
The aim was to visualize the network of genes involved in ovarian cancer in Holstein cattle and to assess the performance of machine learning (ML) methods for predicting ovarian cancer from gene expression microarray data. Gene expression data for healthy and cancerous ovarian stromal cells in Holstein cows (accession number GSE225981) were obtained from the GEO database. Differentially expressed genes (DEGs; up- and down-regulated) were identified with the online web tool GEO2R. After the DEGs and the genes associated with ovarian cancer were identified, Cytoscape software was used to visualize the gene network. Decision tree (DT), random forest (RF) and support vector machine (SVM) were used to predict the phenotype (healthy or cancer) from the microarray data. The variable importance feature of RF, based on the Gini index, was used to select and rank the most important genes in the network. The selected genes were then evaluated to determine their contribution to cancer-related pathways. There were 603 DEGs, of which 327 were up-regulated and 276 were down-regulated. Except for the scenario with 2 samples in the training data and 4 samples in the test data, in which the accuracy of DT was 75%, the ML methods predicted the phenotype (healthy or cancer) with 100% accuracy in all scenarios. The genes GPR65, RHBDF2, TBC1D30, DSG2, H2AC17, AFF3, AGMO, AURKA, CA3 and CA9 were selected by RF as promising potential markers for the diagnosis and prediction of ovarian cancer. A literature survey showed the involvement of these genes in cancer-related processes and pathways. In conclusion, the studied ML methods are recommended for analyzing microarray data, as they performed well in predicting ovarian cancer in Holstein cattle. In addition, the variable importance feature of RF can be part of any study on microarray data for identifying important genes, that is, those highly correlated with the disease in question.
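To illustrate the idea behind Gini-based gene ranking mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' pipeline: it scores each gene by the largest Gini impurity decrease achievable with a single threshold split, a simplified stand-in for the mean decrease in impurity that a random forest averages over many trees. The gene names (`gene_a`, `gene_b`, `gene_c`), the expression values, and the helper functions are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity, 1 - sum(p_k^2), of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts: Dict[int, int] = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_gini_decrease(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest impurity decrease over all threshold splits on one gene."""
    n = len(values)
    parent = gini_impurity(labels)
    best = 0.0
    # Candidate thresholds: midpoints between consecutive distinct values.
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / n
        best = max(best, parent - weighted)
    return best


def rank_genes(
    expression: Dict[str, Sequence[float]], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Rank genes by their best single-split Gini decrease, highest first."""
    scores = [(g, best_gini_decrease(v, labels)) for g, v in expression.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy data: 0 = healthy, 1 = cancer (not taken from GSE225981).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],  # separates the classes cleanly
    "gene_b": [2.0, 2.9, 2.1, 3.0, 1.9, 2.2],  # weakly informative
    "gene_c": [0.5, 0.6, 2.2, 2.1, 2.0, 2.3],  # partially informative
}
ranking = rank_genes(expression, labels)
for gene, score in ranking:
    print(f"{gene}: {score:.3f}")
```

A real analysis would more likely use an established random forest implementation and average the impurity decrease over many trees and bootstrap samples; the single-split score here only shows why a gene that cleanly separates healthy from cancer samples rises to the top of the ranking.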
|
|
Keywords
|
machine learning, microarray, gene, cancer, Gini index
|
|
Address
|
Bu-Ali Sina University, Faculty of Agriculture, Department of Animal Science, Iran; Bu-Ali Sina University, Faculty of Agriculture, Department of Animal Science, Iran; Bu-Ali Sina University, Faculty of Agriculture, Department of Animal Science, Iran
|
|
Email
|
ahahmady@gmail.com