Phrase Alignments in Parallel Corpus Using Bootstrapping Approach
|
|
|
|
|
|
|
|
Authors
|
Tavakoli Leila, Faili Heshaam
|
|
Source
|
International Journal of Information and Communication Technology Research - 2014 - Volume: 6 - Issue: 3 - Pages: 63-76
|
|
Abstract
|
Word choice and word order problems are considered fundamental barriers in statistical machine translation (SMT). These barriers are more pronounced when the training corpus is deficient. Phrase-based SMT has advantages in word choice and local word ordering, so phrase alignment is effective in improving translation quality. In this paper, an approach for automatic alignment is proposed that boosts machine translation quality. Since the alignment problem is more severe when training data is lacking, we build a corpus of phrase alignments with high precision. For this purpose, a novel phrase alignment approach in a bootstrapping manner is proposed. By bootstrapping on the alignment model using a number of features, the accuracy of the phrase table is improved iteratively. These features are based on the phrase table extracted from Moses, IBM Model 3 alignment probabilities, Google Translator, and the fertility of candidates. Our experiments on English-Persian translation show an improvement of about 4.17 BLEU points over PB-SMT as the baseline system.
|
|
Keywords
|
Phrase-Based SMT, scarce training corpus, log-linear models, Maximum Entropy, Bootstrapping Approach, Fertility
|
|
Address
|
University of Tehran, School of Electrical and Computer Engineering, College of Engineering, Iran; University of Tehran, School of Electrical and Computer Engineering, College of Engineering, Iran
|
|
Email
|
hfaili@ut.ac.ir