Topic Detection for the Persian Language News Using Signature Words and Support Vector Machines
|
Authors
|
Khayat Mirza Rasouli Salar, Babaei Giglou Hamed, Razmara Jafar
|
Source
|
Media Technology Summit (اجلاس فناوری رسانه) - 1398 - Volume: 16 - Media Technology Summit - Conference code: 98190-87963 - Pages: 0-0
|
Abstract
|
Topic detection systems, which automatically determine the main topics of news items, are an important field of research in metadata-based approaches to machine learning and natural language processing. The main goal of this work is to build a system that learns previously extracted topics from news text by relating the features extracted from the text to its main topic. We hypothesize that word-based features, such as signature words in the documents, can yield valuable information about news topics. We propose a word-based TF-IDF representation that ignores less valuable words in the documents, combined with a linear SVM classifier, to identify the main topics of the news. The experimental results presented in this paper show that the TF-IDF representation with linear SVM classification of the documents is very promising for identifying the main topic of the news.
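To make the pipeline described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, a document-frequency cutoff as a stand-in for "ignoring less valuable words", a smoothed IDF formula, and a one-vs-rest linear SVM trained by plain subgradient descent on the hinge loss. The toy corpus (English placeholder tokens), the topic labels, and every function name and hyperparameter here are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def build_vocab(docs, min_df=1, max_df_ratio=1.0):
    """Keep terms whose document frequency lies in [min_df, max_df_ratio * N]."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.split()))
    vocab = sorted(t for t, c in df.items() if c >= min_df and c / n <= max_df_ratio)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}  # smoothed IDF
    return vocab, idf

def tfidf_vector(doc, vocab, idf):
    """L2-normalised TF-IDF vector over the retained vocabulary."""
    counts = Counter(doc.split())
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_binary_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularised hinge loss; y values are +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:   # margin violated: hinge term is active
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:                           # only the regulariser contributes
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def train_one_vs_rest(X, labels):
    """One binary linear SVM per topic: that topic against all others."""
    return {c: train_binary_svm(X, [1 if l == c else -1 for l in labels])
            for c in sorted(set(labels))}

# Hypothetical toy corpus; a real system would use tokenised Persian news text.
docs = [
    "the team won the football match",
    "the striker scored in the football league",
    "the parliament passed the budget law",
    "the minister announced the election law",
    "the bank raised the interest rate",
    "the market and the bank cut the rate",
]
labels = ["sport", "sport", "politics", "politics", "economy", "economy"]

# Drop terms that occur in more than half of the documents (here: "the").
vocab, idf = build_vocab(docs, min_df=1, max_df_ratio=0.5)
X = [tfidf_vector(d, vocab, idf) for d in docs]
models = train_one_vs_rest(X, labels)

def predict_topic(text):
    """Return the topic whose classifier gives the highest decision score."""
    x = tfidf_vector(text, vocab, idf)
    return max(models, key=lambda c: dot(models[c][0], x) + models[c][1])

print(predict_topic("football match"))
```

In this sketch the signature words of a topic end up with the largest positive weights in that topic's classifier, which is one way to read the abstract's hypothesis. A production system would more likely rely on an established library for vectorization and SVM training than on this hand-written loop.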
|
Keywords
|
news analysis, linear SVM, one-vs-rest, multi-class classification
|
Address
|
, Iran, , Iran, , Iran