Feature Engineering in Persian Dependency Parser
Authors
Lazemi S., Ebrahimpour-Komleh H.
Source
Journal of AI and Data Mining, 2019, Volume 7, Issue 3, pp. 467-474
Abstract
A dependency parser is one of the fundamental tools in natural language processing: it extracts the structure of sentences and determines the relations between words according to dependency grammar. Dependency parsing is well suited to free word order languages such as Persian. In this paper, a data-driven dependency parser for Persian is developed with the help of a phrase-structure parser. The feature space defined for a parser is one of the important factors in its success, so our goal is to generate and extract features appropriate for the dependency parsing of Persian sentences. To this end, new semantic and syntactic features are defined and added to MSTParser by the stacking method. The semantic features are obtained from word clustering algorithms based on syntagmatic analysis; the syntactic features are obtained from the Persian phrase-structure parser and are encoded as bit strings. Experiments were carried out on the Persian Dependency Treebank (PerDT) and the Uppsala Persian Dependency Treebank (UPDT). The results indicate that the new features improve the performance of the dependency parser for Persian. The unlabeled attachment scores achieved on PerDT and UPDT are 89.17% and 88.96%, respectively.
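For readers unfamiliar with the metric reported above, the unlabeled attachment score is conventionally the fraction of tokens whose predicted head matches the gold-standard head, ignoring the relation label. The following is a minimal Python sketch of that standard definition, not the authors' evaluation code; the function name, the head-index encoding (0 for the root), and the example sentence are all assumptions made for illustration, and details such as punctuation handling may differ in the paper.

```python
from typing import Sequence


def unlabeled_attachment_score(gold_heads: Sequence[int],
                               pred_heads: Sequence[int]) -> float:
    """Return the fraction of tokens whose predicted head equals the gold head."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    # Count tokens attached to the correct head; dependency labels are ignored.
    correct = sum(1 for g, p in zip(gold_heads, pred_heads) if g == p)
    return correct / len(gold_heads)


# Hypothetical 5-token sentence: entry i is the head of token i+1, 0 marks the root.
gold = [2, 0, 2, 5, 3]
pred = [2, 0, 2, 3, 3]  # the fourth token is attached to the wrong head

uas = unlabeled_attachment_score(gold, pred)
print(f"UAS = {uas:.2%}")  # prints "UAS = 80.00%"
```

For a whole treebank, the score is usually computed over all tokens at once (concatenating the head lists of every sentence) rather than by averaging per-sentence scores.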
Keywords
dependency parser, phrase-structure parser, MSTParser, stacking, Persian
Address
University of Kashan, Department of Computer Engineering, Iran; University of Kashan, Department of Computer Engineering, Iran
Email
ebrahimpour@kashanu.ac.ir