A statistical approach to knowledge discovery: Bootstrap analysis of language models for knowledge base population from unstructured text
|
Authors
|
Momtazi S., Moradiannasab O.
|
Source
|
Scientia Iranica - 2019 - Volume: 26 - Issue: 1-Special Issue - Pages: 26-39
|
Abstract
|
In this paper, we propose a novel approach for knowledge discovery from textual data. The generated knowledge base can be used as one of the main components in the cognitive process of question answering systems. The proposed model automatically extracts relations between named entities in Persian. It is a bootstrapping approach based on the n-gram model that finds the representative textual patterns of relations as n-grams in order to extract new knowledge about given named entities. The main motivation for this work is the sentence structure of Persian, which, in contrast to English, follows subject-object-verb order. The proposed approach is purely statistical and requires no background knowledge of the target language. This makes our method applicable to any open-domain relation extraction task. However, as our testbed, we focus on the domain of biographical data of international poets and scientists to build a knowledge base about them. Qualitative evaluation based on human assessment provides evidence for the efficacy of our method.
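The bootstrapping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy corpus (English glosses arranged in subject-object-verb order), the single-token entities, and the choice of bigrams that directly follow the object are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def learn_patterns(sentences, pairs, n=2, top_k=1):
    """Count n-grams that follow a known (subject, object) pair.

    Assumption: in an SOV sentence the relation is signalled by the
    words after the object, so only that suffix is counted.
    """
    counts = Counter()
    for tokens in sentences:
        for subj, obj in pairs:
            if subj in tokens and obj in tokens:
                s, o = tokens.index(subj), tokens.index(obj)
                if s < o:
                    counts.update(ngrams(tokens[o + 1:], n))
    return [pattern for pattern, _ in counts.most_common(top_k)]


def apply_patterns(sentences, patterns, entities):
    """Use learned n-gram patterns to propose new (subject, object) pairs."""
    found = set()
    for tokens in sentences:
        for entity in entities:
            if entity not in tokens:
                continue
            s = tokens.index(entity)
            for pattern in patterns:
                n = len(pattern)
                # The token right before a matched pattern is taken as the object.
                for i in range(s + 2, len(tokens) - n + 1):
                    if tuple(tokens[i:i + n]) == pattern:
                        found.add((entity, tokens[i - 1]))
    return found


def bootstrap(sentences, seeds, entities, rounds=2):
    """Alternate between learning patterns and extracting new pairs."""
    known = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, known)
        known |= apply_patterns(sentences, patterns, entities)
    return known


# Hypothetical toy corpus: English glosses in subject-object-verb order.
corpus = [
    "hafez in shiraz born was".split(),
    "saadi in shiraz born was".split(),
    "goethe in frankfurt born was".split(),
    "einstein in ulm born was".split(),
]
seeds = {("hafez", "shiraz")}
entities = {"hafez", "saadi", "goethe", "einstein"}

result = bootstrap(corpus, seeds, entities)
print(sorted(result))
```

Starting from one seed pair, the sketch learns the bigram that follows the object and reuses it to propose birthplace pairs for the other named entities. No language-specific resources are involved, which mirrors the purely statistical character described in the abstract.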
|
Keywords
|
computational linguistics, information extraction, statistical language modeling, n-gram model, relation extraction, textual pattern acquisition
|
Address
|
Amirkabir University of Technology, Department of Computer Engineering and Information Technology, Iran; Saarland University, Department of Computational Linguistics and Phonetics, Germany
|