Pages 284-295, Language: English
Hatch, John P. / Rugh, John D. / Sakai, Shiro / Prihoda, Thomas J.
Aims: To examine various dimensions of reliability of the Craniomandibular Index, a commonly used instrument for quantifying the severity of signs and symptoms of temporomandibular disorders.
Methods: Classical psychometric theory and generalizability theory were used to assess the reliability of data obtained from a calibration study of examiners participating in a multi-site clinical trial and from a random community sample.
Results: The reliability of aggregate scores formed by summing individual binary-scored items was high, with intraclass correlations ranging from 0.81 to 0.88. When examiners were required to recognize and agree upon the specific pattern of signs and symptoms exhibited by a patient, however, reliability dropped dramatically (multivariate kappas ranged from 0.26 to 0.32). A group of practicing examiners also showed limited ability to agree with the pattern of signs and symptoms identified by a "gold standard" examiner (multivariate kappas ranged from 0.25 to 0.32). Generalizability analysis failed to identify the specific sources of measurement error that played a major role in limiting reliability but demonstrated that the generalizability of aggregate scores was very high.
Conclusion: Methods of classical psychometric theory and generalizability theory support the conclusion that the reliability of aggregate scores is acceptably high. Individual items assessing certain aspects of jaw mobility and joint sounds are measured with poor reliability. Reliability declines when it is defined as the ability of examiners to agree among themselves upon a specific constellation of signs and symptoms, or as their ability to reproduce the "correct" constellation identified by an expert examiner.
Keywords: Craniomandibular Index, temporomandibular disorders, reproducibility of results, psychometrics, generalizability theory
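Illustrative note: The contrast between aggregate-score reliability and pattern agreement reported above can be sketched with a toy example. The item vectors below are invented for illustration and are not data from the study, and the plain two-rater Cohen's kappa used here is a simplified stand-in for the multivariate kappa reported in the abstract. The function name `cohen_kappa` is hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two raters scoring the same binary (0/1) items."""
    n = len(a)
    # Observed proportion of items on which the two raters agree.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal rate of scoring 1.
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1:
        return 1.0  # both raters constant and identical; kappa is undefined
    return (p_observed - p_expected) / (1 - p_expected)


# Invented scores: two examiners rate the same patient on six binary items.
examiner_a = [1, 1, 0, 0, 1, 0]
examiner_b = [0, 1, 1, 0, 0, 1]

# The aggregate (summed) scores are identical...
total_a = sum(examiner_a)
total_b = sum(examiner_b)

# ...but the examiners agree on only two of the six individual items.
kappa = cohen_kappa(examiner_a, examiner_b)

print(total_a, total_b, round(kappa, 3))
```

In this toy case the two examiners produce the same total while disagreeing on most individual items, which is the kind of situation in which a summed score can appear reliable even though the specific pattern of signs and symptoms is not reproduced.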