Divergent Twins Fencing: Protecting Deep Neural Networks Against Query-Based Black-Box Adversarial Attacks




Authors	Farshadfar Elahe, Sadeghzadeh Mesgar Amir Mahdi, Jalili Rasool
Source	The ISC International Journal of Information Security - 2025 - Volume: 17 - Issue: 2 - Pages: 137-150
Abstract	Recent advances in machine learning and deep learning have significantly expanded their applications across various domains. Training deep neural networks is resource-intensive, requiring substantial labeled data and computational power, which makes these models valuable intellectual property for organizations and raises an increasingly crucial need to secure them. A major security threat to deep neural networks is adversarial examples, particularly in the black-box setting. In these attacks, adversaries craft inputs with often imperceptible perturbations to deceive the model into incorrect classifications, with no access to the model internals and solely by interacting with it through queries and responses. Of the two primary methods for creating black-box adversarial examples, i.e., model extraction-based and query-based approaches, this research focuses on the query-based type and explores a novel defense mechanism to reduce the success of such attacks. Our proposed method, called divergent twins fencing (DTF), employs two subtly different models trained with two different loss functions to increase the cost of executing these attacks. The defense is evaluated by measuring the attack success rate and the average number of queries required to generate adversarial examples under two of the most potent attack methods from recent studies, and by comparing its performance against a leading defense strategy in the literature, the random noise defense (RND) method. The results demonstrate our method’s efficacy in enhancing model security against black-box adversarial attacks.
Keywords	adversarial examples, black-box threat model, deep neural networks, model extraction attacks, query-based attacks, security and privacy of machine learning
Address	Sharif University of Technology, Department of Computer Engineering, Iran; Sharif University of Technology, Department of Computer Engineering, Iran; Sharif University of Technology, Department of Computer Engineering, Iran
Email	jalili@sharif.edu


