Tag Extraction from Spatial Documents in Search Engines



Authors	Saeed Borhaninejad, Farshad Hakimpour, Ehsan Hamzei
Source	Journal of Geomatics Science and Technology (علوم و فنون نقشه برداري) - 1396 - Volume: 6 - Issue: 4 - Pages: 201-216
Abstract	Today, selective access to information on the Web is provided through search engines. When the need also involves searching spatial information, however, the search task becomes more complex and requires special capabilities in the search component. The main goal of this research is to create a platform for extracting the spatial information embedded in spatial documents, and to implement and evaluate an integrated approach to retrieving that information. The prevailing approach to spatial information retrieval extracts spatial information through its association with non-spatial information, whereas in existing spatial documents the spatial and non-spatial information are stored together in integrated form. Previous studies have paid little attention to spatial documents and the information they contain. The integrated approach to spatial information retrieval means extracting the spatial and descriptive information in spatial documents jointly and simultaneously. The system built in this research consists of a crawler, a database, and a user interface. In the crawler, spatial documents are discovered and their text is parsed to extract information. The database stores and indexes the information extracted by the crawler, and the user interface provides the interaction between the system and the user. The system was implemented experimentally on an application server as a simulation of the Web. By implementing the integrated approach, this research retrieves spatial information from spatial documents and thereby takes an effective step toward improving the performance of spatial search engines.
Keywords	Spatial Web, Spatial Search Engines, Spatial Documents, Crawler, GML
Address	University of Tehran, School of Surveying and Geospatial Engineering, College of Engineering, Iran; University of Tehran, School of Surveying and Geospatial Engineering, College of Engineering, Iran; University of Tehran, School of Surveying and Geospatial Engineering, College of Engineering, Iran
Email	e.hamzei@ut.ac.ir

Tags Extraction from Spatial Documents by Search Engines

Authors	Borhaninejad S., Hakimpour F., Hamzei E.
Abstract	Selective access to information on the Web is provided today by search engines, but when a query involves spatial information the search task becomes more complex and requires special capabilities in the search engine. The purpose of this study is to extract the information contained in GML documents and to implement and evaluate an integrated approach to retrieving it. The proposed system consists of three components: a crawler, a database, and a user interface.
1. Crawler: This component is the main innovation of the study. A crawler is a piece of software that, starting from an initial seed, visits Web pages, follows the links on each page, and visits the pages those links point to. It repeats this for every new page until all pages have been reviewed and no new pages remain. The crawlers of typical spatial search engines analyze HTML documents and extract the spatial information they contain. In the proposed system, the crawler processes the text of GML documents instead of HTML documents and extracts spatial information from them. The crawler has two main tasks: detecting GML documents among documents of different formats, and parsing GML documents to extract their spatial information.
2. Database: The database has two main tasks: storing the data collected by the crawler, and indexing that information.
3. User interface: This component provides the interaction between user and system; users send their queries to the system through it.
The search process runs in two phases, offline and online. The offline phase covers crawling and storing the extracted information in the database. The online phase covers the user interface and the ranking operation.
The study pursues the following objectives:
1. Extraction of the spatial information embedded in Web documents. Spatial documents contain explicit spatial information, such as the coordinates or the type of a feature, and extracting it improves the response rate of spatial queries in search engines.
2. Implementation and evaluation of an integrated spatial information retrieval approach. The system was implemented as a pilot on an application server acting as a simulation of the Web.
Although engineers and specialists in many fields need raw spatial data and look for it on the World Wide Web, most spatial search engines are based on map representation and pay less attention to spatial data. A substantial volume of spatial documents and information exists on the Web, but the sheer extent of the Web makes them hard to find among other information. As a spatial search engine, the proposed system makes it possible to search across GML documents and thereby improves the efficiency of spatial search engines. Because GML documents include explicit spatial information alongside non-spatial information, the main advantage of this system over other spatial search engines is its integrated approach to spatial and non-spatial data.
Keywords	Spatial Search Engine, Spatial Documents, Crawler, GML
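
The offline phase described in the abstract (crawl from a seed, detect GML documents among other formats, parse them, and extract spatial and descriptive information together) can be illustrated with a minimal Python sketch. This is not the authors' implementation; the abstract does not give their code or data model. Everything named here is an assumption made for illustration: the `FAKE_WEB` dictionary standing in for the Web, the `example.test` URLs, the `app:Park` feature and its values, and the helper names `is_gml`, `extract_features`, and `crawl`. The sketch also handles only `gml:pos` coordinates in the `http://www.opengis.net/gml` namespace, and it keeps results in a plain dictionary rather than an indexed database.

```python
import re
import xml.etree.ElementTree as ET
from collections import deque

GML_NS = "http://www.opengis.net/gml"  # assumed namespace (GML 3.2 appends "/3.2")
LINK_RE = re.compile(r'href="([^"]+)"')

# Hypothetical in-memory "Web": URL -> document text (stands in for HTTP fetches).
FAKE_WEB = {
    "http://example.test/index.html":
        '<html><a href="http://example.test/parks.gml">parks</a>'
        '<a href="http://example.test/about.html">about</a></html>',
    "http://example.test/about.html": "<html>No links here.</html>",
    "http://example.test/parks.gml": """<?xml version="1.0"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml"
                       xmlns:app="http://example.test/app">
  <gml:featureMember>
    <app:Park>
      <app:name>Example Park</app:name>
      <app:geometry>
        <gml:Point><gml:pos>35.70 51.40</gml:pos></gml:Point>
      </app:geometry>
    </app:Park>
  </gml:featureMember>
</gml:FeatureCollection>""",
}


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def is_gml(text):
    """Detect GML by content: well-formed XML with an element in a GML namespace."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return any(el.tag.startswith("{" + GML_NS) for el in root.iter())


def extract_features(text):
    """Return one record per feature, holding spatial and descriptive data together."""
    records = []
    root = ET.fromstring(text)
    for member in root.iter("{%s}featureMember" % GML_NS):
        for feature in member:
            record = {"type": local(feature.tag), "coords": [], "attrs": {}}
            for prop in feature:
                positions = list(prop.iter("{%s}pos" % GML_NS))
                if positions:  # spatial property: parse coordinate tuples
                    for pos in positions:
                        record["coords"].append(
                            tuple(float(v) for v in pos.text.split()))
                else:  # descriptive (non-spatial) property
                    record["attrs"][local(prop.tag)] = (prop.text or "").strip()
            records.append(record)
    return records


def crawl(seed, fetch):
    """Breadth-first crawl from seed; returns {url: feature records} for GML documents."""
    seen, queue, found = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text = fetch(url)
        if text is None:  # unreachable page
            continue
        if is_gml(text):
            found[url] = extract_features(text)
            continue
        for link in LINK_RE.findall(text):  # follow links on non-GML pages
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found


index = crawl("http://example.test/index.html", FAKE_WEB.get)
```

With the sample data above, `index` holds a single entry for `parks.gml` whose record combines the feature type (`Park`), its coordinates, and its `name` attribute. Keeping both kinds of information in one record is one simple way to mirror the integrated approach the abstract describes; a real system would replace the dictionary with a database that supports spatial and text indexing.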