Fusion of Infrared and Visible Images Using a Deep Multiscale Architecture



Author	Sara Khosravi
Source	دريا فنون (Darya Fonoon), 1403 SH, Vol. 11, No. 2, pp. 31-49
Abstract	Image fusion is an important image-processing technique that aims to produce a single image containing the salient features and complementary information of the source images, using suitable feature-extraction methods and fusion strategies. In recent years, deep-learning-based methods have shown great potential in image-fusion applications, and many researchers are working to improve the quality of the fused image with them. This study proposes a multiscale convolutional neural network for fusing visible and infrared images, named MSCNN-VIF, in which the different features and information present in the images are fused using convolutional blocks. The proposed model also includes a multiscale (MS) architecture for better scanning of different parts of the image, with the main aim of improving the performance of the proposed image-fusion system. Overall, the model consists of an encoder and a decoder and comprises three main parts: feature extraction, fusion, and reconstruction of the visible and infrared images. The infrared and visible images are fed to the encoder, which produces background feature maps and detail feature maps. The network then concatenates the two types of feature maps along the channel dimension. Finally, the concatenated feature maps pass through the decoder to recover the original image. Unlike the training phase, the test phase includes a fusion layer that fuses the background maps and the detail features separately. The proposed method was tested on three well-known, publicly available datasets. The results show that it outperforms other methods on several evaluation metrics.
Keywords	image fusion, deep learning, convolutional neural networks, multiscale, encoder, decoder
Address	Payame Noor University, Tehran Center, Faculty of Engineering, Computer Department, Iran
Email	khosravi_un@pnu.ac.ir

Fusion of Visible and Infrared Images Using a Deep Multiscale Architecture

Authors	Sara Khosravi
Abstract	Image fusion is an important technique in image processing. Its aim is to produce a single image that contains the salient features and complementary information of the source images, using appropriate feature-extraction methods and fusion strategies. In recent years, deep-learning methods have shown great potential in image-fusion applications, and many researchers are trying to improve the quality of the fused image with them. In this research, a multiscale convolutional neural network called MSCNN-VIF is proposed for fusing visible and infrared images; it fuses the various features and information in the images using convolutional blocks. In addition, the proposed model includes a multiscale (MS) architecture for better scanning of different parts of the image, with the main goal of improving the performance of the proposed fusion system. The model consists of an encoder and a decoder and has three main parts: feature extraction, fusion, and reconstruction of the visible and infrared images. The infrared and visible images are given to the encoder, which produces background feature maps and detail feature maps. The network then concatenates the two types of feature maps along the channel dimension. Finally, the concatenated feature maps are passed through the decoder to recover the original image. Unlike the training phase, the testing phase adds a fusion layer that fuses the background maps and the detail features separately. The proposed method has been tested on three well-known, publicly available datasets. The results show that it performs better than other methods on several evaluation criteria.
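
The test-time data flow described in the abstract (fuse the background maps, fuse the detail maps, then concatenate along the channels before decoding) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not state the fusion rules, so the element-wise average for background maps and the larger-magnitude choice for detail maps are assumptions made only for illustration, as are all function names. Feature maps are represented as plain nested lists (channels, rows, columns), and the encoder and decoder are left out.

```python
"""Illustrative sketch of a test-time fusion layer; the fusion rules are assumptions."""


def _combine(map_a, map_b, rule):
    """Apply `rule` element-wise to two feature maps with the same channel count."""
    if len(map_a) != len(map_b):
        raise ValueError("feature maps must have the same number of channels")
    return [
        [[rule(a, b) for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(map_a, map_b)
    ]


def fuse_background(bg_ir, bg_vis):
    """Assumed rule: average the infrared and visible background maps."""
    return _combine(bg_ir, bg_vis, lambda a, b: (a + b) / 2.0)


def fuse_detail(dt_ir, dt_vis):
    """Assumed rule: keep the response with the larger magnitude."""
    return _combine(dt_ir, dt_vis, lambda a, b: a if abs(a) >= abs(b) else b)


def concat_channels(background, detail):
    """Concatenate two feature maps along the channel axis (the outer list)."""
    return background + detail


def fuse_for_decoder(bg_ir, dt_ir, bg_vis, dt_vis):
    """Fuse background and detail maps separately, then stack them for a decoder."""
    return concat_channels(fuse_background(bg_ir, bg_vis),
                           fuse_detail(dt_ir, dt_vis))


# Toy inputs: one channel each, spatial size 1 x 2 (hypothetical values).
bg_ir, bg_vis = [[[0.25, 0.5]]], [[[0.75, 0.0]]]
dt_ir, dt_vis = [[[1.0, -0.125]]], [[[-0.5, 0.5]]]

fused = fuse_for_decoder(bg_ir, dt_ir, bg_vis, dt_vis)
print(fused)  # [[[0.5, 0.25]], [[1.0, 0.5]]]
```

The result has two channels: the fused background map followed by the fused detail map. In a real network these would be tensors produced by the encoder, and the stacked result would be passed to the decoder to reconstruct the fused image.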