Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization
|
Authors
|
Oufaida Houda, Nouali Omar, Blache Philippe
|
Source
|
Journal of King Saud University - Computer and Information Sciences - 2014 - Volume: 26 - Issue: 4 - Pages: 450-461
|
Abstract
|
Automatic text summarization aims to produce summaries of one or more texts using machine techniques. In this paper, we propose a novel statistical summarization system for Arabic texts. Our system uses a clustering algorithm and an adapted discriminant analysis method, mRMR (minimum redundancy and maximum relevance), to score terms. Through mRMR analysis, terms are ranked according to their discriminant and coverage power. We also propose a novel sentence extraction algorithm that selects sentences with top-ranked terms and maximum diversity. Our system uses minimal language-dependent processing: sentence splitting, tokenization and root extraction. Experimental results on the EASC and TAC 2011 multilingual datasets show that our proposed approach is competitive with state-of-the-art systems.
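The selection idea in the abstract (prefer sentences carrying highly ranked terms while keeping the chosen sentences diverse) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas, so the relevance measure (sum of term scores), the redundancy measure (mean Jaccard overlap with already selected sentences), and the names `jaccard`, `select_sentences`, and `term_scores` are all assumptions made for illustration. The term scores are taken as given; in the paper they would come from the mRMR analysis.

```python
def jaccard(a, b):
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_sentences(sentences, term_scores, k):
    """Greedily pick up to k sentence indices by relevance minus redundancy.

    sentences   -- list of term lists (assumed already tokenized / reduced to roots)
    term_scores -- dict mapping term -> score (hypothetical input, any term ranking)
    """
    term_sets = [set(s) for s in sentences]
    # Assumed relevance: total score of the distinct terms in the sentence.
    relevance = [sum(term_scores.get(t, 0.0) for t in ts) for ts in term_sets]
    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_value = None, float("-inf")
        for i, ts in enumerate(term_sets):
            if i in selected:
                continue
            # Assumed redundancy: mean overlap with the sentences chosen so far.
            redundancy = (
                sum(jaccard(ts, term_sets[j]) for j in selected) / len(selected)
                if selected
                else 0.0
            )
            value = relevance[i] - redundancy
            if value > best_value:
                best, best_value = i, value
        selected.append(best)
    return selected


# Toy input: sentences 0 and 1 share all their terms, sentence 2 is distinct.
sentences = [["a", "b"], ["a", "b"], ["c"]]
term_scores = {"a": 1.0, "b": 1.0, "c": 1.5}
chosen = select_sentences(sentences, term_scores, 2)
```

In this toy input the second pick skips the duplicate sentence in favor of the distinct one, which is the diversity effect the abstract describes. Relevance and redundancy are on different scales here, so a real system would likely normalize or weight the two terms.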
|
Keywords
|
Arabic text summarization; Sentence extraction; mRMR; Minimum redundancy; Maximum relevance
|
Address
|
Ecole Nationale Supérieure d’Informatique (ESI), Algeria; Research Center on Scientific and Technical Information (CERIST), Algeria; Aix-Marseille Université, Centre national de la recherche scientifique (CNRS), France
|
Email
|
blache@lpl-aix.fr