   Using an Evolving Thematic Clustering in a Text Segmentation Process  
   
Authors Lamprier Sylvain, Amghar Tassadit, Levrat Bernard, Saubion Frederic
Source Journal of Universal Computer Science - 2008 - Volume: 14 - Issue: 2 - Pages: 178-192
Abstract    The thematic text segmentation task consists of identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. In this paper we propose an algorithm for linear text segmentation on general corpora. It relies on an initial clustering of the sentences of the text. This preliminary partitioning provides a global view of the relations between the sentences of the text, considering similarities within a group rather than between individual sentences. The method, called ClassStruggle, is based on the distribution of the occurrences of the members of each class. During the process, the clusters evolve by taking into account notions of proximity and of layout in the text, with the aim of creating groups that contain only sentences related to the same topic development. Finally, boundaries are created between sentences belonging to two different classes. First experimental results are promising: ClassStruggle appears to be very competitive with existing methods.
Keywords Text Segmentation, Clustering
Affiliation University of Angers, France (all four authors)
Email amghar@info.univ-angers.fr
 
     
   
  
 
 

Copyright 2023
Islamic World Science Citation Center
All Rights Reserved