Assessing the Attribute Accuracy of Features in Volunteered Geographic Information


Authors	Behzad Vahedi Torghabeh, Ali Asghar Alesheikh
Source	Journal of Geomatics Science and Technology (علوم و فنون نقشه برداري) - 1394 (Solar Hijri) - Volume 5 - Issue 3 - Pages 49-63
Abstract	Since the concept of volunteered geographic information (VGI) emerged, the quality of this information has been identified as its biggest problem. Various studies have therefore examined the quality of VGI and tried to estimate it. In these studies, however, attribute accuracy has received less attention than the other quality elements, even though it matters greatly in many spatial analyses and applications of VGI. This study therefore uses a new method, based on the Levenshtein algorithm together with text preprocessing, to assess the attribute accuracy of VGI features (in the form of the feature name) by comparing them with reference features. To compute attribute accuracy, it is assumed that matching between the reference and VGI features has already been carried out. The study area is the city of Tehran; data produced by the Municipality of Tehran serve as the reference data set and data from OpenStreetMap as the VGI data set. According to the results, of the VGI features that have a name, 33 percent have a correct name, 44 percent have an approximately correct name, and the remaining 23 percent have an incorrect name, so the overall attribute accuracy of the VGI data is 77 percent.
Keywords	volunteered geographic information, attribute accuracy, Levenshtein algorithm, spatial information quality, data matching, OpenStreetMap
Address	K. N. Toosi University of Technology, Faculty of Geodesy and Geomatics Engineering, Iran; K. N. Toosi University of Technology, Faculty of Geodesy and Geomatics Engineering, Iran
Email	alesheikh@kntu.ac.ir

Assessing the Attribute Accuracy of Volunteered Geographic Information

Authors	Behzad Vahedi Torghabeh, Ali Asghar Alesheikh
Abstract	Since the concept of Volunteered Geographic Information (VGI) emerged, the quality of this type of information has been presented as its biggest problem. The issue has therefore been addressed frequently in the literature, and researchers have tried to evaluate the quality of VGI. However, attribute accuracy, despite its important role in a variety of spatial analyses and applications of VGI, has received less attention than other elements of quality. Positional accuracy, completeness, lineage, resolution, and temporal accuracy are among the most important elements of spatial data quality. In this study, a novel method that combines the Levenshtein algorithm with text preprocessing is used to examine the attribute accuracy of volunteered geographic features by comparing them with reference data. The Levenshtein algorithm measures the difference between two text strings by counting the number of changes (edits) needed to turn one string into the other; the result is usually referred to as the Levenshtein distance. The first step of the proposed method is to find corresponding features in the two data sets, on which the comparison is then based. This step is done by applying an automatic data matching algorithm to the two sets. The algorithm consists of five stages, each applied to either the reference data set or the VGI data set. After data matching is done, each VGI feature is compared with its corresponding match in the reference data set, and the Levenshtein distance between the "name" attributes of the two features is calculated. Features are then categorized as having correct (accurate), approximately correct, or incorrect names based on the Levenshtein distance, assuming that the names of the reference features are correct.
For VGI features without a match in the reference data set, a search distance is defined, within which reference features with exactly the same name as the VGI feature are sought. The study area of this research is the city of Tehran, Iran. A data set produced by the Municipality of Tehran is used as the reference data set and OpenStreetMap data as the VGI data set. According to the results, 47 percent of VGI features have a name attribute; among these, 33 percent have a correct name, 44 percent have an approximately correct name, and the remaining 23 percent have an incorrect name. The overall attribute accuracy of the VGI data set used in this study is thus 77 percent, meaning that 77 percent of the features that have a name attribute have either a correct or an approximately correct name. A future line of research, based on the findings of this paper, could be to develop methods for evaluating the attribute accuracy of a data set without having to compare it with a reference data set.
Keywords	OpenStreetMap
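
The name comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the preprocessing steps or the distance thresholds, so `normalize`, `APPROX_THRESHOLD`, and the use of a length-normalized distance are assumptions made only for this example. Only the Levenshtein distance itself and the three categories (correct, approximately correct, incorrect) come from the text.

```python
from collections import Counter

# Assumed cutoff on the length-normalized distance; not taken from the paper.
APPROX_THRESHOLD = 0.3


def normalize(name: str) -> str:
    """Hypothetical preprocessing: trim, lowercase, collapse whitespace."""
    return " ".join(name.strip().lower().split())


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # distance is symmetric; keep the DP row short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def classify(vgi_name: str, ref_name: str) -> str:
    """Label a VGI name against its matched reference name,
    treating the reference name as correct."""
    v, r = normalize(vgi_name), normalize(ref_name)
    distance = levenshtein(v, r)
    if distance == 0:
        return "correct"
    ratio = distance / max(len(v), len(r))
    if ratio <= APPROX_THRESHOLD:
        return "approximately correct"
    return "incorrect"


def attribute_accuracy(pairs) -> float:
    """Share of named, matched features whose name is correct
    or approximately correct."""
    counts = Counter(classify(v, r) for v, r in pairs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["correct"] + counts["approximately correct"]) / total
```

With the shares reported in the abstract, the same aggregation gives 0.33 + 0.44 = 0.77, which is the 77 percent overall attribute accuracy. A real pipeline for Persian names would likely need script-specific preprocessing (for example, unifying Arabic and Persian letter variants), which this sketch does not attempt.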