Feature Extraction to Identify Network Traffic with Considering Packet Loss Effects



Author	Gandomi Mohammadreza, Hassanpour Hamid
Source	Signal and Data Processing - 1398 (Solar Hijri) - Volume: - Issue: 4 - Pages: 3-16
Abstract	Network traffic identification is one of the basic needs of administrators for controlling the network, improving quality of service, and maintaining network security. One of the main challenges for identification methods based on the statistical analysis of packets is packet loss, which seriously hampers the use of statistical features in network traffic analysis. Packet loss affects statistical features of packets, such as the time interval between successive packets sent by an application, and in some cases considerably reduces the accuracy of traffic identification. The main goal of this paper is to examine the effects of packet loss on statistical features, and hence on the accuracy of application identification, and to extract features suitable for overcoming these effects. To this end, the behavior of four statistical features is examined, and network traffic is identified by extracting features from their distributions. A database of traffic from seven applications with different packet loss rates was prepared, and the accuracy with which a neural network recognizes the applications was analyzed. The results show that the extracted features are robust to packet loss and bring the accuracy of network traffic identification under different packet loss conditions close to the ideal case (no packet loss in the network).
Keywords	network traffic, network traffic identification, machine learning, packet loss
Address	Shahrood University of Technology, Faculty of Computer Engineering and Information Technology, Artificial Intelligence Group, Iran; Shahrood University of Technology, Faculty of Computer Engineering and Information Technology, Artificial Intelligence Group, Iran
Email	h.hassanpour@shahroodut.ac.ir

Feature Extraction to Identify Network Traffic with Considering Packet Loss Effects

Authors	Gandomi Mohammadreza, Hassanpour Hamid
Abstract	A huge volume of network traffic is generated by the various applications on the Internet. Network management plays a crucial role in dealing with this volume. Traffic classification is a basic technique used by Internet service providers (ISPs) to manage network resources and to guarantee Internet security. In addition, growing bandwidth usage on the one hand and the limited physical capacity of communication lines on the other push providers to improve the utilization of network resources. Classification or identification of network traffic is therefore a critical task in network processing for traffic management, anomaly detection, and improving network quality of service (QoS). Port-based and payload-based methods are two classical techniques that are applicable under traditional network conditions. However, many Internet applications use dynamic port numbers for communication, which makes it difficult to identify traffic by port number. Many applications also encrypt their data before transmission to avoid detection, so payload-based techniques are ineffective for such traffic. In recent years, statistical feature-based traffic flow identification methods (STFIM) have attracted the interest of many researchers. The most important part of an STFIM is the selection of efficient statistical features. Preliminary analysis shows that packet loss during data transmission is one of the major challenges in employing STFIM for network traffic identification. Packet loss affects the statistical characteristics of packets, such as the time interval between successive packets sent by an application, and in some cases significantly reduces the accuracy of traffic identification. The main goal of this paper is to examine the effects of packet loss on statistical features, and therefore on the accuracy of identifying applications, and to extract appropriate features to overcome these effects.
For this purpose, the behavior of four statistical features is investigated: the packet size, the time interval between sending and receiving packets, the duration of the flows, and the rate of sending packets. Application traffic is then identified from characteristics of the distributions of these features. We collected a database of network traffic flows from seven applications with different rates of packet loss. We used the extracted features in a multilayer neural network, as a classifier, to differentiate between the traffic of different applications. Experimental results show that the extracted features are robust against packet loss, and the accuracy of network traffic identification is close to that of the ideal state (traffic flow with no packet loss).
Keywords	Network Traffic, Network Traffic Identification, Machine Learning, Packet Loss
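
The four flow statistics named in the abstract (packet size, inter-packet time, flow duration, and packet rate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which distribution characteristics are extracted, so the choice of mean and median here, the synthetic flow, the independent random loss model, and the helper names `flow_features` and `drop_packets` are all assumptions made for illustration.

```python
import random
import statistics


def inter_arrival_times(timestamps):
    """Gaps in seconds between consecutive packet timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def flow_features(timestamps, sizes):
    """Summarise one flow with simple statistics (hypothetical feature set)."""
    gaps = inter_arrival_times(timestamps)
    duration = timestamps[-1] - timestamps[0]
    return {
        "mean_size": statistics.fmean(sizes),      # packet size
        "mean_gap": statistics.fmean(gaps),        # inter-packet time (mean)
        "median_gap": statistics.median(gaps),     # inter-packet time (median)
        "duration": duration,                      # flow duration
        "packet_rate": len(timestamps) / duration  # packets per second
    }


def drop_packets(timestamps, sizes, loss_rate, seed=0):
    """Simulate independent random loss (assumed model).

    The first and last packets are always kept so the flow duration
    stays comparable between the clean and lossy versions.
    """
    rng = random.Random(seed)
    last = len(timestamps) - 1
    kept = [i for i in range(len(timestamps))
            if i in (0, last) or rng.random() >= loss_rate]
    return [timestamps[i] for i in kept], [sizes[i] for i in kept]


# Synthetic flow: 1000 packets of 500 bytes, one every 10 ms.
timestamps = [i * 0.01 for i in range(1000)]
sizes = [500] * 1000

clean = flow_features(timestamps, sizes)
lossy = flow_features(*drop_packets(timestamps, sizes, loss_rate=0.1))

for name in clean:
    print(f"{name:12s} clean={clean[name]:.4f} lossy={lossy[name]:.4f}")
```

In this synthetic setup the mean inter-packet gap grows and the packet rate falls once packets are dropped, while the median gap stays at the original 10 ms spacing. That is one simple way to see why a feature taken from the distribution of a statistic can be less sensitive to loss than its raw average. It says nothing about the accuracy figures reported in the paper.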