Kuder-Richardson Correction

All tests contain error. This is true of tests in the physical sciences and in education as well as of psychological tests, so reliability is always a concern when giving a test. Reliability is a measure of how far a test's results can be trusted. Validity is the agreement between a test score and the quality it is believed to measure; put more simply, it reflects the gap between what a test actually measures and what it is intended to measure. A test can be reliable without being valid: it may give the same results every time and still not tell you what you need to know. An unreliable test, by contrast, cannot be highly valid, because its scores are not dependable enough to reflect the intended quality.

Stability is one aspect of reliability. It is a measure of repeatability over a period of time: if the same test is given time after time, the results should stay the same. Consistency is another aspect. It is a measure of reliability through similarity within the test, with individual questions giving predictable answers every time. A number of methods for estimating reliability have therefore been developed, such as the split-half method, the Spearman-Brown formula, the Cronbach alpha formula, and the Kuder-Richardson formula. We will discuss the Kuder-Richardson formula.

The Kuder-Richardson formula was developed in 1937 to measure internal-consistency reliability from a single administration of a single test. It calculates how consistent people's responses are across the questions on a test. All items are compared with each other, rather than half of the items with the other half. There are two forms of the Kuder-Richardson formula, K-R 20 and K-R 21. The K-R 20 formula is used for knowledge tests in which each answer is coded 1 if correct and 0 if incorrect. The assessment would be a forced-choice assessment, such as a computer-based test. The K-R 20 formula can be implemented in a computer model that is administered online or in an Excel spreadsheet. The K-R 21 formula would be used when technological assistance is not available. It is said to be simpler but sometimes less precise than the K-R 20 formula, yet it is a generally acceptable measure of internal consistency. K-R 21 assumes that all of the questions are equally difficult; K-R 20 does not. An example of K-R 20 follows.
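For reference, the two forms are commonly written as shown below. The notation is an assumption of this sketch rather than something taken from the sources cited here: k is the number of items, p_i is the proportion of subjects answering item i correctly, q_i = 1 - p_i, X-bar is the mean total score, and sigma_X^2 is the variance of the total scores (some texts divide by n, others by n - 1, which changes the result slightly).

```latex
\rho_{\text{KR-20}} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right),
\qquad
\rho_{\text{KR-21}} = \frac{k}{k-1}\left(1 - \frac{\bar{X}\,(k - \bar{X})}{k\,\sigma_X^2}\right)
```

When every item has the same difficulty p, the mean total score is kp, so the sum of p_i q_i equals X-bar (k - X-bar) / k and the two forms give the same value. This is why K-R 21 needs only the mean and variance of the total scores, while K-R 20 needs the item-level results.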

Kuder-Richardson's ρ is used with dichotomous variables. Each column represents one of the items on the test; each row represents the scores of a given subject. Two or more items are required (k >= 2), and the scores for each item should be stored in a separate data column. If the variables that represent the scores are not already in binary form, use the rules for treating numeric and textual data as binary. The first image below shows part of a data set from a test with k = 12 items and n = 10 subjects. The bottom image below displays the output table providing ρ (K-R 20) and ρ (K-R 21), estimated for the full data set (Gigawiz, 2010).
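The data layout described above (one row per subject, one column per item, scores coded 0 or 1) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the cited software: the function names kr20 and kr21, the helper _total_stats, and the small demo_scores matrix are all hypothetical, the data are made up, and the variance of the total scores is assumed to be the population variance (divided by n).

```python
def _total_stats(scores):
    """Return (mean, population variance) of each subject's total score."""
    totals = [sum(row) for row in scores]
    n = len(totals)
    mean_total = sum(totals) / n
    # Assumption: population variance (divide by n, not n - 1).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return mean_total, var_total


def kr20(scores):
    """K-R 20 for a matrix of 0/1 scores (rows = subjects, columns = items)."""
    n = len(scores)
    k = len(scores[0])
    if k < 2:
        raise ValueError("two or more items are required (k >= 2)")
    _, var_total = _total_stats(scores)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    # Sum of p_i * q_i over items, where p_i is the proportion correct.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


def kr21(scores):
    """K-R 21: uses only the mean and variance of the total scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("two or more items are required (k >= 2)")
    mean_total, var_total = _total_stats(scores)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))


# Hypothetical toy data: 5 subjects (rows) by 4 items (columns).
demo_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print("K-R 20:", round(kr20(demo_scores), 3))
print("K-R 21:", round(kr21(demo_scores), 3))
```

For this toy matrix the two functions return about 0.80 and 0.67. The K-R 21 value is lower because the four made-up items differ in difficulty, which is the situation in which K-R 21 is expected to be less precise than K-R 20.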


Gigawiz. (2010). Retrieved from /reliability.html#KRFormula

Siegle, D. (2010). Relationship of Test Forms and Testing Sessions Required for Reliability Procedures. Neag School of Education, University of Connecticut. Retrieved from Siegle/research/Instrument%20Reliability%20and%20Validity/Reliability.htm

Bolton, F., & Parker, R. (2008). Handbook of Measurement and Evaluation in Rehabilitation (pp. 46-47). Austin, TX: Pro-Ed International Publisher.