A simulation study of rater agreement measures with 2x2 contingency tables

  1. Ato García, Manuel
  2. López García, Juan José
  3. Benavente Reche, Ana
Journal: Psicológica: Revista de metodología y psicología experimental

ISSN: 1576-8597

Publication date: 2011

Volume: 32

Issue: 2

Pages: 385-402

Type: Article

Bibliographic references

  • Agresti, A. (1992). Modelling patterns of agreement and disagreement. Statistical Methods in Medical Research, 1, 201-218.
  • Agresti, A.; Ghosh, A. & Bini, M. (1995). Raking kappa: Describing potential impact of marginal distributions on measures of agreement. Biometrical Journal, 37, 811-820.
  • Agresti, A. (2002). Categorical Data Analysis. 2nd Edition. New York, NY: Wiley.
  • Aickin, M. (1990). Maximum likelihood estimation of agreement in the constant predictive probability model, and its relation to Cohen's kappa. Biometrics, 46, 293-302.
  • Ato, M.; Benavente, A. & López, J.J. (2006). Análisis comparativo de tres enfoques para evaluar el acuerdo entre observadores. Psicothema, 18, 638-645.
  • Bennett, E.M.; Alpert, R. & Goldstein, A.C. (1954). Communications through limited response questioning. Public Opinion Quarterly, 18, 303-308.
  • Bloch, D.A. & Kraemer, H.C. (1989). 2 x 2 kappa coefficients: measures of agreement or association. Biometrics, 45, 269-287.
  • Brennan, R.L. & Prediger, D.J. (1981). Coefficient kappa: some uses, misuses and alternatives. Educational and Psychological Measurement, 41, 687-699.
  • Byrt, T.; Bishop, J. & Carlin, J.B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology, 46, 423-429.
  • Carrasco, J.L. & Jover, Ll. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics, 59, 849-858.
  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
  • Cronbach, L.J.; Gleser, G.C.; Nanda, H. & Rajaratnam, N. (1972). The Dependability of Behavioral Measurements. New York, NY: Wiley.
  • Darroch, J.N. & McCloud, P.I. (1986). Category distinguishability and observer agreement. Australian Journal of Statistics, 28, 371-388.
  • Dunn, G. (1989). Design and Analysis of Reliability Studies: the statistical evaluation of measurement errors. London: Arnold.
  • Feinstein, A.R. & Cicchetti, D.V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43, 543-549.
  • Fleiss, J.L.; Cohen, J. & Everitt, B.S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323-327.
  • Graham, P. (1995). Modelling covariate effects in observer agreement studies: the case of nominal agreement. Statistics in Medicine, 14, 299-310.
  • Guggenmoos-Holtzmann, I. (1993). How reliable are chance-corrected measures of agreement? Statistics in Medicine, 12, 2191-2205.
  • Guggenmoos-Holtzmann, I. & Vonk, R. (1998). Kappa-like indices of observer agreement viewed from a latent class perspective. Statistics in Medicine, 17, 797-812.
  • Gwet, K. (2001). Handbook of inter-rater reliability. Gaithersburg, MD: Stataxis.
  • Gwet, K. (2008). Computing inter-rater reliability and its variance in the presence of high agreement. British Journal of Mathematical & Statistical Psychology, 61, 29-48.
  • Hoehler, F.K. (2000). Bias and prevalence effects on kappa viewed in terms of sensitivity and specificity. Journal of Clinical Epidemiology, 53, 499-503.
  • Holley, J.W. & Guilford, J.P. (1964). A note on the G-index of agreement. Educational and Psychological Measurement, 24, 749-753.
  • Hsu, L.M. & Field, R. (2003). Interrater agreement measures: comments on kappa_n, Cohen's kappa, Scott's π and Aickin's α. Understanding Statistics, 2, 205-219.
  • Janson, S. & Vegelius, J. (1979). On generalizations of the G-index and the phi coefficient to nominal scales. Multivariate Behavioral Research, 14, 255-269.
  • Lantz, C.A. & Nebenzahl, E. (1996). Behavior and interpretation of the kappa statistic: resolution of the two paradoxes. Journal of Clinical Epidemiology, 49, 431-434.
  • Lin, L.; Hedayat, A.S.; Sinha, B. & Yang, M. (2002). Statistical methods in assessing agreement: models, issues and tools. Journal of the American Statistical Association, 97, 257-270.
  • Martín, A. & Femia, P. (2004). Delta: a new measure of agreement between two raters. British Journal of Mathematical and Statistical Psychology, 57, 1-19.
  • Martín, A. & Femia, P. (2008). Chance-corrected measures of reliability and validity in 2 x 2 tables. Communications in Statistics - Theory and Methods, 37, 760-772.
  • Martín, A. & Luna, J.D. (1989). Tests and intervals in multiple choice tests: a modification of the simplest classical model. British Journal of Mathematical and Statistical Psychology, 42, 251-263.
  • Maxwell, A.E. (1977). Coefficients of agreement between observers and their interpretation. British Journal of Psychiatry, 130, 79-83.
  • McGraw, K.O. & Wong, S.P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1, 30-46.
  • Schuster, C. (2002). A mixture model approach to indexing rater agreement. British Journal of Mathematical and Statistical Psychology, 55, 289-303.
  • Schuster, C. & von Eye, A. (2001). Models for ordinal agreement data. Biometrical Journal, 43, 795-808.
  • Schuster, C. & Smith, D.A. (2002). Indexing systematic rater agreement with a latent class model. Psychological Methods, 7, 384-395.
  • Scott, W.A. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321-325.
  • Shoukri, M.M. (2004). Measures of Interobserver Agreement. Boca Raton, FL: CRC Press.
  • Shrout, P.E. & Fleiss, J.L. (1979). Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
  • Tanner, M.A. & Young, M.A. (1985a). Modeling agreement among raters. Journal of the American Statistical Association, 80, 175-180.
  • Tanner, M.A. & Young, M.A. (1985b). Modeling ordinal scale disagreement. Psychological Bulletin, 98, 408-415.
  • Von Eye, A. & Mun, E.Y. (2005). Analyzing Rater Agreement. Mahwah, NJ: Lawrence Erlbaum Associates.
  • Zwick, R. (1988). Another look at interrater agreement. Psychological Bulletin, 103, 374-378.