Automatic evaluation of Persian-web video search engines based on vote aggregation



Authors	Yadolahi Mohammad Mahdi, Zargari Farzad, Farhoodi Mojgan
Source	Signal and Data Processing - 1397 - Issue: 3 - Pages: 3-12
Abstract	Today, the very rapid growth of the Internet and its increasing penetration into people's lives have led many users to turn to search engines for their everyday needs, and these search engines require continuous development and improvement. Evaluating search engines to determine their effectiveness is therefore of great importance. In Iran, as in other countries, extensive research has been carried out on building domestic special-purpose search engines. One of the most important of these is the video search engine, which is responsible for retrieving videos from the web. To assess the quality of these search engines and improve them continuously, the service level of each one must be evaluated in comparison with the other existing search engines. Because the speed of evaluation plays an important role in determining the course of the required improvements, automatic evaluation of search engines becomes very important. This paper presents a method based on vote aggregation for the automatic evaluation of video search engines. The method focuses on the Persian web and evaluates video search engines by developing a new content-based similarity measure built on the motion vectors of the videos. To test the method, a mechanism was designed to compare its results with those of human evaluation. The results show a correlation of more than 94% between the two methods, which indicates that the proposed automatic evaluation method is reliable.
Keywords	Automatic evaluation, video search engines, Persian web
Address	Iran Telecommunication Research Center, Information Technology Research Institute, Information Technology Platforms Group, Iran; Iran Telecommunication Research Center, Information Technology Research Institute, Information Technology Platforms Group, Iran; Iran Telecommunication Research Center, Information Technology Research Institute, Information Technology Platforms Group, Iran
Email	farhood@itrc.ac.ir

Automatic Evaluation of Video Search Engines in the Persian Web Domain Based on Majority Voting

Authors	Yadolahi Mohammadmahdi, Zargari Farzad, Farhoodi Mojgan
Abstract	Today, the growth of the Internet and its strong influence on people's lives have led many users to meet their daily needs through search engines, so search engines need to be modified and continuously improved. Evaluating search engines to determine their performance is therefore of paramount importance. In Iran, as in other countries, extensive research is being carried out on search engines. To evaluate the quality of search engines and continually improve their performance, each one must be evaluated and compared with the other existing ones. Since speed plays an important role in performance assessment, automatic search engine evaluation methods have attracted great attention. In this paper, a method based on majority voting is proposed to assess video search engines. We introduce a mechanism to assess the automatic evaluation method by comparing its results with those obtained by human evaluation of the search engines. The results show a 94% correlation between the two methods, which indicates the reliability of the automated approach.
In general, the proposed method can be described in three steps.
Step 1: Retrieve the first k_retrieve results of n different video search engines and build the returned result set for each written query.
Step 2: Determine the level of relevance of each result retrieved from the search engines.
Step 3: Evaluate the search engines by computing different evaluation criteria based on the relevance decisions for the videos retrieved by each search engine.
Clearly, the main part of any evaluation system that aims to measure the accuracy of search engines is the second step. In this paper, we present a new solution based on the aggregation of votes to determine whether a result is relevant and, if so, its level of relevance.
For this purpose, for each query the results returned by the different search engines are compared with each other. Results returned by more than m of the search engines (m …), and results whose normalized URLs match the normalized URLs returned by m1 of the other search engines, are considered relevant. At the second level, the retrieved results are compared in terms of content: after the address-like similarity has been calculated, all results are passed to the motion vector extraction component, which extracts and stores their motion vectors.
In the content-based similarity algorithm, the set of motion vectors is first treated as a sequence of motion vectors. We then look for the greatest similarity between the shorter sequence and the longer one. To find it, we consider a window whose length equals that of the shorter video's sequence; within this window we calculate and store the similarity of the two sequences. The maximum similarity found is then reported as the similarity of the two videos.
In the proposed method, after the similarity between the results returned by the different search engines has been identified, their relevance is graded at three levels: unrelated (0), slightly related (1) and related (2). Google is currently the world's largest and best-performing search engine, and most search engines are compared with it and try to match its functionality. For this reason, the first five Google results are given the minimum relevance level, slightly related, by default. The similarity module is then used to evaluate the similarity of the retrieved n results of the tested search engines.
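The two similarity levels described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the URL normalization rules, the per-vector similarity measure, or how the window is moved. The code assumes that normalization drops the scheme, a leading `www.`, the fragment and a trailing slash, that two motion vectors are compared by rescaled cosine similarity, and that the window slides one position at a time. All function names, engine names, and URLs are hypothetical.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def normalize_url(url):
    """Assumed normalization: drop scheme, 'www.', fragment and trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def url_votes(results_by_engine):
    """Count, for each normalized URL, how many engines returned it."""
    votes = Counter()
    for urls in results_by_engine.values():
        # A set gives each engine at most one vote per URL.
        votes.update({normalize_url(u) for u in urls})
    return votes


def vector_similarity(a, b):
    """Cosine similarity of two 2-D motion vectors, rescaled to [0, 1] (assumed measure)."""
    norm_a, norm_b = math.hypot(*a), math.hypot(*b)
    if norm_a == 0 or norm_b == 0:
        return 1.0 if norm_a == norm_b else 0.0
    cosine = (a[0] * b[0] + a[1] * b[1]) / (norm_a * norm_b)
    return (cosine + 1) / 2


def max_window_similarity(seq_a, seq_b):
    """Slide a window of the shorter sequence's length over the longer one
    and return the highest mean per-vector similarity found."""
    short, long_ = sorted((seq_a, seq_b), key=len)
    if not short:
        return 0.0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        score = sum(vector_similarity(x, y) for x, y in zip(short, window)) / len(short)
        best = max(best, score)
    return best


# Hypothetical result lists for one query from three engines.
results = {
    "engine_a": ["http://www.example.com/v/1/", "http://example.com/v/2"],
    "engine_b": ["https://example.com/v/1#t=3"],
    "engine_c": ["https://EXAMPLE.com/v/1", "http://example.org/x"],
}
m = 2  # illustrative threshold; "more than m" engines must agree
votes = url_votes(results)
relevant = sorted(url for url, count in votes.items() if count > m)
```

With these placeholder inputs, only the URL returned by all three engines passes the vote. Results that fail the URL-level vote could still be matched at the content level by calling `max_window_similarity` on their motion-vector sequences and applying a threshold, whose value the abstract does not give.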
Keywords	Automatic assessment, Video search engine, Persian Web