An Investigation of the Acoustic Correlates of Persian Speech Rhythm with a Focus on Between-Speaker Distinctions


Authors	Taghva Nafiseh, Moloodi Amirsaeid, Abolhasanizadeh Vahideh
Source	Pazhuhesh-ha-ye Zabanshenasi (Researches in Linguistics) - 1399 - Volume: 12 - Issue: 2 - Pages: 27-50
Abstract	One property of speech rhythm is the durational variability of phonetic intervals at several levels, including segmental intervals, consonant and vowel intervals, consonantal and vocalic intervals, voiced and unvoiced intervals, syllable intervals, and syllable-peak intervals. Computing the durational variability of some of these intervals, such as consonant and vowel intervals and consonantal and vocalic intervals, allows languages to be classified by their rhythm. In addition, everyday experience shows that a speaker can sometimes be identified from voice alone. The segmental and suprasegmental properties of speech rhythm are among the features that can be used for speaker identification. In this study, the acoustic correlates of Persian speech rhythm are examined in a read text by computing various durational measures of the intervals listed above. To find the best rhythm measure for speaker identification, the between-speaker rhythmic variability of the data is also examined. The results support the view that Persian is syllable-timed. Furthermore, the segmental and suprasegmental rhythm properties of the data show significant between-speaker variability among Persian speakers; among the measures, the pairwise variability of consonantal-vocalic intervals (npvi_cv), followed by syllable rate and the percentage of vocalic intervals, are the stronger variables for capturing between-speaker distinctions.
Keywords	speech rhythm, durational variability, acoustic correlates, between-speaker, rhythm measures
Address	Shiraz University, Faculty of Literature and Humanities, Department of Foreign Languages and Linguistics, Iran; Shiraz University, Faculty of Literature and Humanities, Department of Foreign Languages and Linguistics, Iran; Shahid Bahonar University of Kerman, Faculty of Literature and Humanities, Department of Foreign Languages, Iran
Email	vahidehabolhasani@yahoo.com

Acoustic Correlations of Speech Rhythms in Persian Based on Variability of Between-speakers Characteristics

Authors	Taghva Nafiseh, Moloodi Amirsaeid, Abolhasanizadeh Vahideh
Abstract	The durational variability of phonetic intervals is considered one of the properties of speech rhythm. These intervals include segmental, vowel, consonantal, vocalic, intervocalic, voiced, unvoiced, syllable, and syllable-peak intervals. The durational variability of some of these intervals, such as vowel, consonantal, vocalic, and intervocalic intervals, is used to classify languages by their rhythm. In addition, in some cases a speaker can be identified only through his or her voice, and the segmental and suprasegmental properties of speech can be used for this purpose. In this study, the acoustic correlates of Persian speech rhythm in a read text are calculated with various durational measures. Between-speaker rhythmic variability is also examined to find the best rhythm measures for Persian speaker identification. The results confirm that Persian is close to the syllable-timed languages. Moreover, the segmental and suprasegmental results demonstrate significant between-speaker variability in Persian. Among the measures, nPVI-VC and %V (percentage of vocalic intervals) best capture between-speaker variability in Persian.

Keywords: Speech rhythm, Durational variability, Acoustic correlations, Between-speaker variability, Rhythmic measures

Introduction

The rhythmic properties of languages have been one of the controversial issues in recent linguistic studies. Early studies on the classification of rhythm types focused on syllable and foot durations and defined speech rhythm in terms of isochrony (Abercrombie, 1967; Lloyd James, 1940; Pike, 1945). On this view, Germanic languages had feet of roughly equal duration, which is why they were called "stress-timed" languages. Romance languages were likewise believed to have syllables of similar duration, so they were called "syllable-timed" languages.

However, strict isochrony is easily violated in spontaneous speech (Dauer, 1987). Dauer argued that languages with different rhythms also differ in syllable weight and vowel reduction: stress-timed languages usually have a more complex syllable structure and a higher rate of vowel reduction. Ramus, Nespor, and Mehler (1999) examined this hypothesis by measuring, for each sentence, the standard deviation of vocalic intervals (∆V) and of consonantal intervals (∆C) as well as the percentage of vocalic intervals (%V). Grabe and Low (2002) then introduced the pairwise variability index (PVI) to measure durational variability between successive vocalic and consonantal intervals (nPVI-V and rPVI-C). Dellwo (2010) proposed further methods of normalizing for speech rate, including the coefficient of variation (Varco) and the natural logarithm. Arvaniti (2012) introduced an amplitude-envelope-based rhythm measure, with which she investigated the recurrence of acoustic information rather than of segmental units.

Another application of rhythm measures is in forensic science. Because speakers of the same language have different voices, one concern of forensic science is how voices differ between speakers (Rose, 2004). Dellwo, Leemann, and Kolly (2015) cited three reasons for this diversity: the nature of the articulatory system, linguistic factors, and prosodic factors. This variety in speakers' voices is called between-speaker variability. Recent evidence from various datasets suggests that rhythm measures based on different phonetic intervals can vary significantly within a language as a function of speaker (Leemann, Kolly, and Dellwo, 2014; Wiget et al., 2010; Yoon, 2010).
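The interval-based measures cited in the Introduction (%V, ∆V/∆C, Varco, rPVI, and nPVI) can be illustrated with a minimal Python sketch. This is not the Dellwo script used in the study. The function names are hypothetical, durations are assumed to be in milliseconds, and the use of the sample standard deviation is an assumption, since the abstract does not say which variant was used.

```python
from statistics import mean, stdev


def percent_v(vocalic, consonantal):
    """%V: share of total utterance duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(durations):
    """Delta-V or Delta-C: standard deviation of interval durations.

    Assumption: sample standard deviation; needs at least two intervals.
    """
    return stdev(durations)


def varco(durations):
    """Varco: standard deviation as a percentage of the mean duration,
    which normalizes the raw deviation for speech rate."""
    return 100.0 * stdev(durations) / mean(durations)


def rpvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return mean(diffs)


def npvi(durations):
    """Normalized PVI: each successive difference is divided by the mean
    of the pair, and the average is scaled by 100."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100.0 * mean(terms)


if __name__ == "__main__":
    # Hypothetical interval durations (ms) for one sentence, not study data.
    vocalic = [80, 120, 95, 140]
    consonantal = [60, 110, 70, 90]
    print(percent_v(vocalic, consonantal))
    print(delta(vocalic), delta(consonantal))
    print(varco(consonantal), rpvi(consonantal), npvi(vocalic))
```

Each function takes a plain list of durations, so the same code applies to vocalic, consonantal, or combined interval sequences. Identical durations give a PVI of zero, and larger alternation between neighbouring intervals raises it.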
Materials and Methods

Ten native speakers of contemporary standard Persian (5 men and 5 women) read a Persian version of "The North Wind and the Sun" in the acoustic room at Shiraz University. The Persian version of this story contains seven complex sentences, so the dataset comprised 70 tokens (10 speakers × 7 sentences).

The corpus was acoustically analyzed in Praat (v 6.1.09), in which TextGrids with six tiers were created. In the first tier, the onset and offset of each segment were determined manually and transcribed in IPA. In the second tier, the vowels and consonants were tagged. In the third tier, the vowel and consonant intervals were labeled according to the number of consonants and vowels they contained. In the fourth tier, the vocalic and consonantal intervals were determined. In the fifth tier, the boundaries between syllables were tagged manually. Finally, in the sixth tier, the peak of each syllable was identified automatically, according to the sonority principle, by a script written by Dellwo[1]. Speech rhythm measures from previous work were then applied, and all measures were calculated automatically with the existing script written by Dellwo.

The mean and standard deviation of the results obtained from the scripts were calculated in SPSS (v 23) to classify the rhythm of Persian. In addition, Pearson correlation and one-way ANOVA tests were used to identify the most robust between-speaker measure.

Discussion of Results and Conclusions

The results confirm that Persian is close to the syllable-timed languages. In addition, seven measures showed statistically significant between-speaker differences: speech rate (syl/s), VarcoC, %V, nPVI-V, nPVI-VC, ∆C(ln), and ∆Peak(ln). Based on the results of the present study, nPVI-VC and %V are the most powerful measures for showing between-speaker variability in Persian.

[1] https://www.cl.uzh.ch/de/people/team/phonetics/vdellw.html
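The one-way ANOVA step described in the Materials and Methods can be sketched as follows. The study ran this test in SPSS. The Python function below is only an illustration of the same statistic, with a hypothetical function name and made-up numbers. Each group holds one speaker's per-sentence values of a single rhythm measure, and a large F means that the measure varies more between speakers than within them.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups.

    Each group is the list of values of one rhythm measure for one speaker.
    Raises ZeroDivisionError if there is no within-group variation at all.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of speakers
    n = len(all_values)    # total number of tokens

    # Variation of the speaker means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each token around its own speaker's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical values for three speakers, not data from the study.
    speakers = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(one_way_anova_f(speakers))
```

With the layout described above (10 speakers, 7 sentences each), this function would return 9 and 60 as the between-speaker and within-speaker degrees of freedom. Converting F to a p-value requires the F distribution, which is omitted here.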