Performance Evaluation of Three Deep Learning Models in Building Footprint Extraction from Aerial and Satellite Images



Authors	Ahmadian Nima, Sedaghat Amin, Mohammadi Nazila
Source	مهندسي فناوري اطلاعات مكاني - 1402 - Volume: 11 - Issue: 1 - Pages: 105-123
Abstract	Buildings, as one of the most important man-made features, have many applications in different fields, and identifying and assessing them with aerial and satellite images is essential. Deep learning methods have recently been widely used to extract building footprints automatically from such images. It is necessary to understand how the different methods compare with one another across image types with different geometric and illumination conditions. To this end, this study evaluates the performance of three prominent deep learning models, Mask-RCNN (mask region-based convolutional neural network), U-Net, and MA-FCN (multi-scale aggregation fully convolutional network), in extracting building footprints from three satellite and aerial image datasets, using the IoU (intersection over union) and F1-score metrics. The study also examines the effect of using a digital surface model (DSM) in building extraction by these algorithms. Overall, the results show that, in addition to the model type, the number and quality of training samples and the use of a DSM affect the results. Using a DSM alongside three-band images is also an effective way to improve the performance of deep learning models in building footprint extraction. The DSM improved the IoU of building extraction for U-Net and MA-FCN by 7.46% and 5.7%, respectively, on satellite images and by 3.61% and 3.34% on aerial images. Because U-Net and MA-FCN combine encoder features with decoder features, they are more accurate at building boundaries. Mask-RCNN is more robust to overfitting because its architecture includes a ResNet structure.
Keywords	deep learning, buildings, digital surface model, satellite imagery, U-Net
Affiliation	University of Tabriz, Faculty of Civil Engineering, Department of Surveying Engineering, Iran (all three authors)
Email	n.mohammadi@tabrizu.ac.ir

Performance Evaluation of Three Deep Learning Models in Building Footprint Extraction from Aerial and Satellite Images

Authors	Ahmadian Nima, Sedaghat Amin, Mohammadi Nazila
Abstract	Buildings, as one of the most important man-made objects, have various applications, and identifying them in aerial and satellite images is essential. Deep learning models have recently been widely used to extract building footprints automatically from aerial and satellite images. It is essential to evaluate and compare the characteristics of different deep learning models on images with geometric and brightness variations. For this purpose, this research evaluates the performance of three deep learning models, Mask-RCNN (mask region-based convolutional neural network), U-Net, and MA-FCN (multi-scale aggregation fully convolutional network), on three aerial and satellite datasets using the F1-score and IoU (intersection over union) metrics. The effect of using a digital surface model (DSM) in building extraction with these models is also examined. The results indicate that the choice of model, the quantity and quality of training samples, and the use of a DSM all affect performance. Using a DSM alongside 3-band RGB images is an effective way of improving building footprint extraction with deep learning models. With a DSM, the IoU of U-Net and MA-FCN in building footprint extraction increased by 7.46% and 5.7%, respectively, on the satellite dataset and by 3.61% and 3.34% on the aerial dataset. U-Net and MA-FCN are more precise at building boundaries because they concatenate the feature maps of the encoder and decoder parts when producing the final segmentation maps. Mask-RCNN is more robust to overfitting because it uses ResNet in its architecture.
Keywords	deep learning, buildings, digital surface model, satellite imagery, U-Net
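
The abstract reports results with the IoU and F1-score metrics. As a minimal sketch of how these two metrics are commonly computed for binary building masks, the following assumes pixel-wise evaluation with NumPy; the function name `iou_and_f1` and the toy masks are hypothetical and are not taken from the paper, which may evaluate differently (for example per object rather than per pixel).

```python
import numpy as np

def iou_and_f1(pred, truth):
    """Pixel-wise IoU and F1-score for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = np.logical_and(pred, truth).sum()    # building pixels found correctly
    fp = np.logical_and(pred, ~truth).sum()   # background predicted as building
    fn = np.logical_and(~pred, truth).sum()   # building pixels that were missed

    union = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return float(iou), float(f1)

# Toy example (hypothetical data): 1 = building, 0 = background.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
truth = np.array([[1, 0, 0],
                  [0, 1, 1]])
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
```

For a single pair of masks the two metrics are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU and the two always rank models in the same order on that pair.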