Fully Connected to Fully Convolutional: A Bridge to the Past

Author	Amintoosi Mahmood
Source	Soft Computing and Information Technology (رايانش نرم و فناوري اطلاعات) - 1401 - Volume: 11 - Issue: 1 - Pages: 60-72
Abstract	Over the past decade, numerous convolutional networks have been devised for semantic segmentation of images, and they have shown very good performance in detecting and labeling objects. Most of these networks involve large architectures capable of detecting tens or hundreds of predefined classes. Most applications use architectures in which, after several convolutional layers, a conventional classifier is applied to the features extracted by the network. This paper describes a method for converting a network whose classifier consists of two layers, a flatten layer and a dense (fully connected) layer, into a fully convolutional network. The main advantage of this approach is the ability to operate on inputs of variable size and to produce an output map instead of a single number, which is the same advantage that fully convolutional networks offer. Recent deep learning models generally use training images in which the regions of interest are marked with masks, but in the approach proposed here only labeled images (where the label specifies the class of the whole image) are given to the network. The details of the method are presented through the new problem of classifying and recognizing signboards written in the Shekasteh Nastaliq and Thuluth scripts, distinguishing healthy from diseased apple leaves (both as two-class problems), and the problem of recognizing Persian digits. To this end, a convolutional network with a fully connected last layer is first designed and trained on square images. A new fully convolutional model is then defined based on the previous model, and the weights of the previous model are copied to the new model. The only difference between the two models is in the last layer, but the new model can work on input images of any size. Experimental results have shown the effectiveness of this approach (program code at https://github.com/mamintoosi/fc2fc ).
Keywords	deep learning, convolutional neural networks, image classification, object detection, multilayer perceptron
Address	Hakim Sabzevari University, Faculty of Mathematics and Computer Science, Iran
Email	m.amintoosi@hsu.ac.ir

Fully Connected to Fully Convolutional: Road to Yesterday

Authors	Amintoosi Mahmood
Abstract	In the last decade, several convolutional networks have been developed for semantic segmentation, and they have shown excellent performance in recognizing and labeling objects in images. Most of these networks involve large-scale architectures that can detect tens or hundreds of predefined classes. With the exception of fully convolutional networks, most applications use architectures that, after the convolutional layers, apply a conventional classifier to the extracted features. This paper describes a method for converting a network whose classifier consists of a flatten layer and a dense (fully connected) layer into a fully convolutional network. The main advantage of this method is the ability to work on inputs of variable size and to produce an output map instead of a single number, which is the advantage of fully convolutional networks. Newer deep learning models generally use training images in which the areas of interest are marked by masks, but in the proposed method only labeled images are given to the network. The proposed method is detailed through a new problem of classifying signboards written in Shekasteh-Nastaliq and Suls (Thuluth) calligraphy and the classification of apple leaf diseases (both two-class problems), as well as the problem of recognizing handwritten Persian digits. For this purpose, a convolutional network whose last layer is fully connected is first designed and trained on square images. Then a new fully convolutional model is defined based on the previous model, and the weights of the previous model are copied to the new model. The only difference between the two models is in the last layer, but the new model is able to work on input images of any size. Experimental results show the effectiveness of the proposed approach.
Keywords	deep learning, convolutional neural networks, image classification, object detection, multilayer perceptron
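
The conversion summarized in the abstract (copying the weights of a flatten + dense head into a convolutional layer so the model accepts inputs of any size) can be illustrated with a minimal NumPy sketch. This is not the author's code from the repository; the function names, the toy sizes, and the assumption of a channels-last, row-major flatten order are all hypothetical choices made for illustration.

```python
import numpy as np

def dense_head(features, weights, bias):
    """Flatten + dense head: one score vector for a fixed-size feature map."""
    return features.reshape(-1) @ weights + bias

def dense_to_conv_kernel(weights, h, w, c):
    """Reshape dense weights (h*w*c, k) into a conv kernel (h, w, c, k).

    Assumes the flatten layer used row-major order over (h, w, c).
    """
    return weights.reshape(h, w, c, -1)

def conv_head(features, kernel, bias):
    """'Valid' cross-correlation of the kernel over a feature map of any size."""
    kh, kw, _, k = kernel.shape
    out_h = features.shape[0] - kh + 1
    out_w = features.shape[1] - kw + 1
    out = np.empty((out_h, out_w, k))
    for i in range(out_h):
        for j in range(out_w):
            window = features[i:i + kh, j:j + kw, :]
            # Contract the window with the kernel over height, width, channels.
            out[i, j] = np.tensordot(window, kernel, axes=3) + bias
    return out

# Toy sizes (hypothetical): 4x4 feature map, 8 channels, 2 classes.
rng = np.random.default_rng(0)
h, w, c, k = 4, 4, 8, 2
weights = rng.normal(size=(h * w * c, k))
bias = rng.normal(size=k)
kernel = dense_to_conv_kernel(weights, h, w, c)

small = rng.normal(size=(h, w, c))   # training-size feature map
large = rng.normal(size=(6, 9, c))   # larger, non-square feature map

# On the training size, both heads give the same scores.
same = np.allclose(dense_head(small, weights, bias),
                   conv_head(small, kernel, bias)[0, 0])
# On the larger input, the convolutional head yields a map of scores.
map_shape = conv_head(large, kernel, bias).shape
```

On the training-size input the two heads agree, while on the larger input the convolutional head returns one score vector per window position, which is the "output map instead of a number" described in the abstract. Any final activation (for example softmax) would then be applied per map position.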