Comparison of three software programs for evaluating DIF by means of the Mantel-Haenszel procedure: EASY-DIF, DIFAS and EZDIF
- José Luis Padilla 1
- Mª Dolores Hidalgo 2
- Isabel Benítez 1
- Juana Gómez-Benito 3
- 1 University of Granada, Spain
- 2 University of Murcia, Spain
- 3 University of Barcelona, Spain
ISSN: 1576-8597
Year of publication: 2012
Volume: 33
Issue: 1
Pages: 87-105
Type: Article
Journal: Psicológica: Revista de metodología y psicología experimental
Abstract
The analysis of differential item functioning (DIF) examines whether people with matching ability levels respond differently to test items according to characteristics such as language and ethnicity. This analysis can be performed by calculating various statistics, one of the most important being the Mantel-Haenszel statistic, which can be computed with software programs such as EZDIF, DIFAS and, more recently, EASY-DIF. In this context, the aim of the present study is to compare these three software programs by using simulated and real data. The procedural characteristics of the three programs, and the results each obtained from the same dataset, were compared. DIFAS and EASY-DIF always provided equivalent results, while EZDIF was less accurate when using the thin matching strategy. The results also showed that DIFAS and EASY-DIF were the easiest to run, especially for testing practitioners, with the latter offering a broader range of results for key characteristics for detecting DIF.
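As a rough illustration of the statistic the three programs compute, the following sketch estimates the Mantel-Haenszel common odds ratio, the MH D-DIF index on the ETS delta scale, and the continuity-corrected MH chi-square for a single dichotomous item under thin matching (one stratum per distinct matching score). It is not the code of EASY-DIF, DIFAS or EZDIF; the function name, the argument names and the "R"/"F" group labels are assumptions made for this example.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(item, group, total):
    """Return (alpha_MH, MH D-DIF, MH chi-square) for one dichotomous item.

    item  : iterable of 0/1 responses to the studied item
    group : iterable of "R" (reference) or "F" (focal) labels (assumed coding)
    total : iterable of matching scores, e.g. the total test score
    """
    # Thin matching: one 2x2 table [A, B, C, D] per distinct matching score,
    # where A/B = reference correct/incorrect, C/D = focal correct/incorrect.
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for u, g, s in zip(item, group, total):
        col = 0 if u == 1 else 1
        row = 0 if g == "R" else 2
        tables[s][row + col] += 1

    num = den = sum_a = sum_exp = sum_var = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        # Skip strata that cannot contribute (one group missing or n < 2).
        if n < 2 or n_ref == 0 or n_foc == 0:
            continue
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_exp += n_ref * m1 / n  # E(A_k) under the no-DIF hypothesis
        sum_var += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))  # Var(A_k)

    if num <= 0 or den <= 0 or sum_var <= 0:
        raise ValueError("not enough information to estimate the MH statistics")

    alpha = num / den                       # common odds ratio
    delta = -2.35 * math.log(alpha)         # MH D-DIF (ETS delta scale)
    chi2 = (abs(sum_a - sum_exp) - 0.5) ** 2 / sum_var  # 1 df, with correction
    return alpha, delta, chi2
```

With this coding, an odds ratio above 1 (a negative MH D-DIF) indicates that the item favours the reference group among examinees with the same matching score. Thick matching would differ only in how the scores in `total` are pooled into strata before the tables are built.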
References
- Breslow, N. E., & Day, N. E. (1980). Statistical methods in cancer research: Volume 1-The analysis of case-control studies. Lyon: International Agency for Research on Cancer.
- Clauser, B.E., Nungester, R.J., Mazor, K. & Ripkey, D. (1996). A comparison of alternative matching strategies for DIF detection in tests that are multidimensional. Journal of Educational Measurement, 33, 202-214.
- Dorans, N.J., & Holland, P.W. (1993). DIF detection and description: Mantel-Haenszel and Standardization. In P.W. Holland & H. Wainer (Eds.), Differential Item Functioning (pp. 35-66). Hillsdale, NJ: Erlbaum.
- Goldberg, D. (1972). The Detection of Psychiatric Illness by Questionnaire. Windsor: National Foundation for Educational Research.
- González, A.; Padilla, J.L.; Hidalgo, M.D.; Gómez-Benito, J. & Benítez, I. (2011). EASYDIF: Software for analysing differential item functioning using the Mantel-Haenszel and standardization procedures. Applied Psychological Measurement, 35, 483-484.
- Guilera, G.; Gómez-Benito, J. & Hidalgo, M.D. (2009). Scientific production on the Mantel-Haenszel procedure as a way of detecting DIF. Psicothema, 21 (3), 492-498.
- Hidalgo, M. D., & Gómez-Benito, J. (2010). Education measurement: Differential item functioning. In P. Peterson, E. Baker, & B. McGaw (Eds.), International Encyclopedia of Education (3rd edition). USA: Elsevier - Science & Technology.
- Holland, P. W., & Thayer, D. T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Lawrence Erlbaum.
- Spanish Ministry of Health and Social Policies. National Health Survey 2006: <http://www.msps.es/estadEstudios/estadisticas/encuestaNacional/encuesta2006.htm> [Accessed: March 3, 2011].
- Mantel, N. (1963). Chi-square tests with one degree of freedom, extension of the Mantel-Haenszel procedure. American Statistical Association Journal, 58, 690-700.
- Mantel, N. & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.
- Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105-118.
- Millsap, R. E. & Everson, H. T. (1993). Methodology review: Statistical approaches for assessing measurement bias. Applied Psychological Measurement, 17, 297-334.
- Narayanan, P. & Swaminathan, H. (1994). Performance of the Mantel-Haenszel and simultaneous item bias procedures for detecting differential item functioning. Applied Psychological Measurement, 18(3), 315-328.
- Penfield, R. D. (2003). Application of the Breslow-Day test of trend in odds ratio heterogeneity to the detection of nonuniform DIF. Alberta Journal of Educational Research, 49, 231-243.
- Penfield, R. D. (2005). DIFAS: Differential Item Functioning Analysis System. Applied Psychological Measurement, 29, 150-151.
- Potenza, M. T. & Dorans, N. J. (1995). DIF assessment for polytomously scored items: A framework for classification and evaluation. Applied Psychological Measurement, 19, 23-37.
- Waller, N. G. (1998). EZDIF: Detection of Uniform and Nonuniform Differential Item Functioning With the Mantel-Haenszel and Logistic Regression Procedures. Applied Psychological Measurement, 22, 391.
- Zwick, R. & Ercikan, K. (1989). Analysis of differential item functioning in the NAEP history assessment. Journal of Educational Measurement, 26, 55-66.