Computational Linguistics

Quarterly (March, June, September, December)
160 pp. per issue
6 3/4 x 10
Founded: 1974
ISSN 0891-2017
E-ISSN 1530-9312
2008 ISI Impact Factor: 2.656


March 2004, Vol. 30, No. 1, Pages 95-101
Posted Online March 13, 2006.
(doi:10.1162/089120104773633402)
© 2004 Association for Computational Linguistics

The Kappa Statistic: A Second Look

Barbara Di Eugenio

University of Illinois at Chicago, Computer Science, 1120 SEO (M/C 152), 851 South Morgan Street, Chicago, IL 60607.

Michael Glass

Valparaiso University, Mathematics and Computer Science, 116 Gellerson Hall, Valparaiso, IN 46383.






In recent years, the kappa coefficient of agreement has become the de facto standard for evaluating intercoder agreement for tagging tasks. In this squib, we highlight issues that affect κ and that the community has largely neglected. First, we discuss the assumptions underlying different computations of the expected agreement component of κ. Second, we discuss how prevalence and bias affect the κ measure.
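The abstract does not spell out which computations of expected agreement it compares. As a minimal sketch, assume the two common choices: expected agreement from each coder's own category distribution (as in Cohen's κ) and expected agreement from a single distribution pooled over both coders (as in Scott's π and the Siegel and Castellan formulation). The function names and the toy labels below are hypothetical and are not taken from the paper.

```python
from collections import Counter

def observed_agreement(a, b):
    """Fraction of items on which the two coders chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def expected_per_coder(a, b):
    """Chance agreement using each coder's own label distribution (assumed: Cohen-style)."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    return sum((count_a[k] / n) * (count_b[k] / n) for k in set(a) | set(b))

def expected_pooled(a, b):
    """Chance agreement using one distribution pooled over both coders (assumed: Scott-style)."""
    pooled = Counter(a) + Counter(b)
    total = 2 * len(a)
    return sum((pooled[k] / total) ** 2 for k in pooled)

def kappa(p_o, p_e):
    """Chance-corrected agreement: (P(A) - P(E)) / (1 - P(E))."""
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy data: coder A prefers "yes", coder B prefers "no" (a bias between coders).
coder_a = ["yes"] * 6 + ["no"] * 4
coder_b = ["yes"] * 4 + ["no"] * 6

p_o = observed_agreement(coder_a, coder_b)
p_e_per_coder = expected_per_coder(coder_a, coder_b)
p_e_pooled = expected_pooled(coder_a, coder_b)

kappa_per_coder = kappa(p_o, p_e_per_coder)
kappa_pooled = kappa(p_o, p_e_pooled)

print(f"observed agreement: {p_o:.3f}")
print(f"per-coder expected agreement: {p_e_per_coder:.3f}, kappa: {kappa_per_coder:.3f}")
print(f"pooled expected agreement: {p_e_pooled:.3f}, kappa: {kappa_pooled:.3f}")
```

With the same observed agreement, the two expected-agreement computations yield different κ values whenever the coders' label distributions differ, which is one reason a reported κ should state which computation was used.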

Cited by

Ron Artstein, Massimo Poesio. (2008) Inter-Coder Agreement for Computational Linguistics. Computational Linguistics 34:4, 555-596
Online publication date: 1-Dec-2008.
Trevor Cohn, Chris Callison-Burch, Mirella Lapata. (2008) Constructing Corpora for the Development and Evaluation of Paraphrase Systems. Computational Linguistics 34:4, 597-614
Online publication date: 1-Dec-2008.
Dennis Reidsma, Jean Carletta. (2008) Reliability Measurement without Limits. Computational Linguistics 34:3, 319-326
Online publication date: 1-Sep-2008.
Raquel Fernández, Jonathan Ginzburg, Shalom Lappin. (2007) Classifying Non-Sentential Utterances in Dialogue: A Machine Learning Approach. Computational Linguistics 33:3, 397-427
Online publication date: 1-Sep-2007.
Petra Saskia Bayerl, Karsten Ingmar Paul. (2007) Identifying Sources of Disagreement: Generalizability Theory in Manual Annotation Studies. Computational Linguistics 33:1, 3-8
Online publication date: 1-Mar-2007.
Melita Hajdinjak, France Mihelič. (2006) The PARADISE Evaluation Framework: Issues and Findings. Computational Linguistics 32:2, 263-272
Online publication date: 1-Jun-2006.
Richard Craggs, Mary McGee Wood. (2005) Evaluating Discourse and Dialogue Coding Schemes. Computational Linguistics 31:3, 289-296
Online publication date: 1-Sep-2005.