Online learning for imbalanced data streams with concept drift by belief theory and chaotic function



Author	Hamidzadeh Javad, Rashidi Mahmoodi Mohammad Ali, Moradi Mona
Source	Signal and Data Processing - 1402 - Issue: 4 - Pages: 23-34
Abstract	The characteristics of data streams are unstable over time and class distributions change, so learning models frequently need to adapt to concept drift. This paper presents a classifier for imbalanced data streams with concept drift, aiming to address two challenges: imbalance among the observed classes and the occurrence of concept drift. The proposed method uses clustering to remove borderline and noisy stream data. The data are weighted with a belief function and, taking the data labels into account, oversampling is performed in low-density regions of the minority class using a chaotic approach. Concept drift is then detected by defining a threshold. Labels are predicted by an ensemble classifier with weighted majority voting. The performance of the proposed method is evaluated on datasets from the UCI repository using the LOO method and compared with state-of-the-art classifiers. The experimental results indicate the superiority of the proposed method in terms of the evaluation criteria.
Keywords	concept drift, data stream, imbalanced data, online classification, belief theory
Address	Sadjad University, Faculty of Computer Engineering and Information Technology, Iran, Sadjad University, Faculty of Computer Engineering and Information Technology, Iran, Sadjad University, Faculty of Computer Engineering and Information Technology, Iran
Email	monamoradi0@gmail.com

Online learning for imbalanced data streams with concept drift by belief theory and chaotic function

Authors	Hamidzadeh Javad, Rashidi Mahmoodi Mohammad Ali, Moradi Mona
Abstract	Continual learning from data streams is a pivotal aspect of machine learning, requiring algorithms capable of adapting to incoming data. However, the ongoing evolution of data streams presents a formidable challenge, as previously acquired knowledge may become outdated. This challenge, known as concept drift, demands timely detection so that learning models can adapt effectively. While various drift detectors have been proposed, they often assume a relatively balanced class distribution. In scenarios with imbalanced data streams, these detectors may exhibit bias toward majority classes, overlooking shifts in minority classes. Moreover, the imbalance among classes can change over time, with roles shifting between majority and minority classes, especially when relationships among classes become complex due to overlapping regions. In this paper, a novel classification method is introduced for imbalanced streaming data affected by concept drift. The proposed method continuously monitors arriving streams to detect and adapt to both imbalance and concept drift. Upon receiving a new block of data, the proposed method employs k-means clustering to identify non-dense regions and performs oversampling for minority classes. Cluster centers are selected using the belief function to address overlap between majority and minority classes. Using a chaotic approach, each new sample is added based on its neighborhood and the size of that neighborhood. Subsequently, concept drift detection is conducted using three pre-defined thresholds that cover time intervals and classification errors. Finally, labels are predicted by ensemble learning and weighted majority voting. Experiments conducted on benchmark datasets from the UCI database evaluate the performance of the proposed method using leave-one-out (LOO) validation and comparisons with state-of-the-art methods. The results demonstrate the superiority of the proposed method across various evaluation criteria, highlighting its effectiveness in addressing imbalanced streaming data with concept drift.
Keywords	belief theory, concept drift, data stream, imbalanced data, online classification
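
The abstract names chaotic oversampling and weighted majority voting without giving their formulas, so the following is only a minimal sketch of how those two steps might look, not the authors' implementation. It assumes the logistic map as the chaotic function and assumes that synthetic minority samples are placed on the segment between a minority sample and one of its neighbors. The function names, the starting value `x0`, and the classifier weights are all hypothetical.

```python
from collections import defaultdict


def logistic_map(x0, n, r=4.0):
    """Return n successive values of the logistic map x <- r * x * (1 - x).

    Assumption: the abstract does not say which chaotic function is used;
    the logistic map is a common choice and is used here for illustration.
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_oversample(seed, neighbor, n_new, x0=0.37):
    """Create n_new synthetic points between a minority sample and a neighbor.

    Each chaotic value in [0, 1] is used as an interpolation coefficient,
    so every synthetic point lies on the segment from seed to neighbor.
    """
    coeffs = logistic_map(x0, n_new)
    return [
        [s + c * (t - s) for s, t in zip(seed, neighbor)]
        for c in coeffs
    ]


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Hypothetical minority sample and neighbor in a 2-D feature space.
    print(chaotic_oversample([0.0, 0.0], [1.0, 2.0], n_new=3))
    # Three hypothetical ensemble members voting with illustrative weights.
    print(weighted_majority_vote(["a", "b", "b"], [0.9, 0.3, 0.4]))  # a
```

In this sketch a single accurate classifier (weight 0.9) outvotes two weaker ones (0.3 + 0.4), which is the behavior that distinguishes weighted from plain majority voting.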