A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
|
Authors
|
Khalandi Saman, Soleimanian Gharehchopogh Farhad
|
Source
|
Journal of Advances in Computer Engineering and Technology - 2018 - Volume: 4 - Issue: 3 - Pages: 167-184
|
Abstract
|
With the rapid growth in the number of documents, text document classification (TDC) methods have become crucial. This paper presents a hybrid model of invasive weed optimization (IWO) and a naive Bayes (NB) classifier (IWO-NB) for feature selection (FS), in order to reduce the large feature space in TDC. TDC includes several steps: text processing, feature extraction, forming feature vectors, and final classification. In the presented model, the authors form a feature vector for each document by weighting features for use in IWO. The documents are then used to train the NB classifier, and in the test phase similar documents are classified together. FS increases accuracy and decreases computation time. IWO-NB was evaluated on the Reuters-21578, WebKB, and CADE 12 datasets. To demonstrate the superiority of the proposed model in FS, a genetic algorithm (GA) and particle swarm optimization (PSO) were used as comparison models. The results show that, in FS, the proposed model achieves higher accuracy than NB and the other models. In addition, comparing the proposed model with and without FS suggests that FS decreases the error rate.
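The interaction between feature selection and the NB classifier described above can be illustrated with a minimal sketch. It is not the authors' implementation: the IWO search itself is not reproduced, and scoring a candidate feature subset by NB accuracy on held-out documents is an assumption about how such a wrapper could work. The function names (`train_nb`, `predict_nb`, `accuracy`) and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, selected):
    """Fit a multinomial naive Bayes model using only tokens in `selected`."""
    class_docs = Counter(labels)            # number of documents per class
    token_counts = defaultdict(Counter)     # per-class token frequencies
    for tokens, label in zip(docs, labels):
        token_counts[label].update(t for t in tokens if t in selected)
    return class_docs, token_counts, set(selected)

def predict_nb(model, tokens):
    """Return the class with the highest log-posterior for one document."""
    class_docs, token_counts, selected = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        counts = token_counts[label]
        denom = sum(counts.values()) + len(selected)  # Laplace smoothing
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in selected:                         # ignore unselected features
                score += math.log((counts[t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(selected, train, test):
    """Assumed fitness of a candidate feature subset: NB accuracy on `test`."""
    model = train_nb([d for d, _ in train], [y for _, y in train], selected)
    return sum(predict_nb(model, d) == y for d, y in test) / len(test)

# Hypothetical toy corpus (not taken from Reuters-21578, WebKB, or CADE 12).
train = [
    ("grain wheat export the".split(), "grain"),
    ("wheat crop the report".split(), "grain"),
    ("oil crude barrel the".split(), "oil"),
    ("crude oil price report".split(), "oil"),
]
test = [
    ("wheat export report".split(), "grain"),
    ("crude barrel the".split(), "oil"),
]

good_subset = {"wheat", "crude"}   # discriminative terms
poor_subset = {"the", "report"}    # uninformative terms
print(accuracy(good_subset, train, test), accuracy(poor_subset, train, test))
```

A search method such as IWO, GA, or PSO would generate many candidate subsets and keep those that score best under a fitness function of this kind; the two hand-picked subsets here only show how much the choice of features can change the classifier's accuracy.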
|
Keywords
|
text document classification, invasive weed optimization, naive Bayes, feature selection
|
Address
|
Islamic Azad University, Urmia Branch, Department of Computer Engineering, Iran; Islamic Azad University, Urmia Branch, Department of Computer Engineering, Iran
|
Email
|
bonab.farhad@gmail.com
|