Differential item functioning and test performance: a comparison of logistic regression, the Rasch model, and Mantel-Haenszel


Authors	Karami Hosein, Khodi Ali
Source	پژوهش هاي زبانشناختي در زبان هاي خارجي (Journal of Foreign Language Research), 1399, Vol. 10, No. 4, pp. 842-853
Abstract	One of the tools for examining test performance is differential item functioning (DIF). This method can identify the factors that affect examinees' performance and help prevent bias in a test. Over the past two decades, many methods have been proposed for detecting DIF. The sheer number of DIF detection methods sometimes confuses researchers. It also makes it difficult to compare the findings of studies that have examined DIF with different methods. The present study examines and compares the results obtained from three DIF detection methods: the Rasch model, logistic regression, and Mantel-Haenszel (MH). The data come from the University of Tehran English Proficiency Test (UTEPT), a high-stakes test administered annually to doctoral candidates. Analysis of uniform DIF with the three methods showed that the items did not differ greatly in their functioning. The logistic regression analysis flagged two items for DIF, in line with the Mantel-Haenszel method. Likewise, the items identified as strong DIF indicators in the Rasch model were the same items flagged by the other two models. The results show that using different methods to examine DIF does not necessarily lead to different results, and that any of the methods used in this study can be employed.
Keywords	differential item functioning, validity, test fairness, discrimination
Address	University of Tehran, Department of English Language and Literature, Iran; University of Tehran, Kish Campus, Department of English Language and Literature, Iran
Email	ali.khodi@ut.ac.ir

Differential item functioning and test performance: a comparison between the Rasch model, logistic regression, and Mantel-Haenszel

Authors	Karami Hosein, Khodi Ali
Abstract	Differential item functioning (DIF) is one of the tools used to examine test performance. It can identify factors that affect examinees' performance and help prevent bias in a test. A plethora of DIF detection methods has been suggested over the last couple of decades. This multiplicity of methods sometimes confuses researchers and complicates the comparison of findings obtained with different methods. This study investigated the comparability of results from three widely used DIF detection techniques: the Rasch model, logistic regression, and Mantel-Haenszel (MH). The data come from an administration of the University of Tehran English Proficiency Test (UTEPT), a high-stakes test administered annually to PhD candidates. An analysis of DIF by the three techniques indicated that the items did not differ significantly in their performance. The Mantel-Haenszel procedure flagged two items as showing DIF, matching the findings of the logistic regression model. Likewise, the items detected as strong-DIF items in the Rasch model were the same items detected by the other two models. Thus, it could be stated that logistic regression and the Rasch model are among the best models for the assessment of DIF in language tests. Applying such methods in test validation promises to increase the quality of assessment and to meet the need for fair and justifiable results.
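Illustration	The abstract names the Mantel-Haenszel procedure without showing how it works. The Python sketch below is only an illustration of the general technique and is not the authors' analysis or code. It assumes dichotomous (0/1) item scores, two groups labelled "ref" and "focal", and matching on total test score. The function names `mantel_haenszel_dif` and `make_rows` and the sample counts are hypothetical. It computes the Mantel-Haenszel common odds ratio across score strata and the ETS delta transformation, -2.35 ln(alpha); it omits the significance test that a full DIF analysis would also report.

```python
import math
from collections import defaultdict


def make_rows(group, score, n_correct, n_wrong):
    """Build synthetic (group, total_score, correct) records for one stratum."""
    return [(group, score, 1)] * n_correct + [(group, score, 0)] * n_wrong


def mantel_haenszel_dif(rows):
    """Return (alpha_mh, delta_mh) for one item.

    rows: iterable of (group, total_score, correct), where group is
    "ref" or "focal" and correct is 0 or 1. Examinees are matched on
    total_score, so each distinct score forms one 2x2 table.
    """
    # Per stratum: [ref correct, ref wrong, focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in rows:
        if group not in ("ref", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[score][idx] += 1

    num = 0.0  # sum over strata of (ref correct * focal wrong) / n
    den = 0.0  # sum over strata of (ref wrong * focal correct) / n
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if num == 0.0 or den == 0.0:
        raise ValueError("odds ratio is undefined or degenerate for these data")

    alpha = num / den                # common odds ratio; 1.0 means no DIF
    delta = -2.35 * math.log(alpha)  # ETS delta scale; negative favors "ref"
    return alpha, delta


if __name__ == "__main__":
    # Synthetic single-stratum data: the reference group does better
    # than the focal group at the same total score.
    rows = make_rows("ref", 1, 12, 8) + make_rows("focal", 1, 8, 12)
    alpha, delta = mantel_haenszel_dif(rows)
    print(f"alpha_MH = {alpha:.3f}, delta_MH = {delta:.3f}")
```

An odds ratio near 1 (delta near 0) suggests that matched examinees from both groups have similar odds of answering the item correctly. Values far from 1 would mark the item for closer review, as with the two items that the abstract reports as flagged.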
Keywords