Improving the R-FCN deep network for object detection and labeling

Authors	Ghanbari Sorkhi Ali, Hassanpour Hamid, Fateh Mansoor
Source	Machine Vision and Image Processing - 1398 - Volume: 6 - Issue: 2 - Pages: 43-59
Abstract	Detecting and labeling objects in images is a fundamental challenge in a number of machine vision applications, and in recent years deep learning has drawn considerable attention from researchers. This paper first introduces the most recent deep networks and analyzes their strengths and weaknesses. It then presents an improved version of the R-FCN network. The proposed method is built on the ResNet architecture and a fully convolutional network. It introduces a new deep-network architecture for region proposal, together with a hybrid method based on a two-class fuzzy SVM and SVR for detecting and labeling objects. It also uses a new loss function, the Cauchy-Schwarz divergence, which performed better in terms of both speed and accuracy. The proposed method, using the ResNet-101 architecture, was evaluated on the SUN dataset for detecting and labeling 36 objects, and the results show improved performance over the baseline R-FCN network. The proposed method reaches an mAP of 48.38% with an average processing time of 0.13 seconds per image; compared with the best method in this area, it is roughly 2% better in mAP and 0.4 seconds faster.
Keywords	Object detection and recognition, R-FCN, deep learning, binary fuzzy support vector machine, Cauchy-Schwarz divergence
Address	Shahrood University of Technology, Iran; Shahrood University of Technology, Faculty of Computer Engineering, Iran; Shahrood University of Technology, Faculty of Computer Engineering, Iran
Email	mansoor_fateh@shahroodut.ac.ir

Improvement of the RFCN's deep network in object detection and annotation

Authors	Ghanbari Sorkhi Ali, Hassanpour Hamid, Fateh Mansoor
Abstract	Detecting and annotating objects in images is one of the major challenges in some machine vision applications. In recent years, deep learning has attracted the attention of researchers. This paper first introduces the newest deep networks and analyzes their strengths and weaknesses. It then presents an improved version of the R-FCN network. The proposed method is based on the ResNet architecture and the fully convolutional network. It introduces a new architecture based on a deep region proposal network, together with a combined method based on the binary fuzzy SVM and SVR for the final detection and categorization of objects. A new loss function, the Cauchy-Schwarz divergence loss, is also used; it has shown better performance in terms of speed and accuracy. The proposed method, using the ResNet-101 architecture, was tested on the SUN dataset for the detection and annotation of 36 objects, and the results indicate improved performance compared to the baseline R-FCN network. In terms of Mean Average Precision (mAP), the proposed method achieves 48.38%, with an average processing time of 0.13 seconds per image. Compared to the best method in this area, it improves mAP by about 2% and is 0.04 seconds faster.
Keywords	RFCN