The formula for r looks like this:
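A common definitional form of Pearson's r is sketched below. The notation is an assumption made for illustration, not taken from this text: n paired scores (X_i, Y_i), with \bar{X} and \bar{Y} as the means of X and Y.

```latex
r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \; \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
```

The numerator measures how X and Y vary together. The denominator rescales that quantity so that r always falls between -1 and 1.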
Absolute value of r | Strength |
0.0-0.2 | very weak |
0.2-0.4 | weak |
0.4-0.6 | moderate |
0.6-0.8 | strong |
0.8-1.0 | very strong |
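The strength table can be read as a simple lookup, sketched here in Python. Two points are assumptions rather than statements from the table: the label is based on the absolute value of r (the sign is treated as direction only), and a value that falls exactly on a shared boundary, such as 0.4, is assigned to the higher band. The function name `strength_of_r` is hypothetical.

```python
def strength_of_r(r: float) -> str:
    """Return the verbal strength label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    size = abs(r)  # assumption: the sign gives direction, not strength

    # Assumption: a value exactly on a boundary (e.g. 0.4) goes to the higher band.
    if size < 0.2:
        return "very weak"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong"


if __name__ == "__main__":
    for value in (0.1, -0.75, 0.4, 1.0):
        print(value, strength_of_r(value))
```

Under these assumptions, r = -0.75 is labeled "strong" because only its size is considered.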
X | Y | Correlation |
nominal | nominal | Phi coefficient |
nominal | ordinal | Rank biserial coefficient (r_rb) |
nominal | interval/ratio | Point biserial coefficient (r_pb) |
ordinal | ordinal | Spearman rank-order coefficient (rho) |
interval/ratio | interval/ratio | Pearson r |
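The selection table can also be expressed as a lookup keyed by the measurement levels of X and Y. This is a minimal sketch: the names `COEFFICIENT_BY_LEVELS` and `choose_coefficient` are hypothetical, and treating the order of X and Y as interchangeable is an assumption. Combinations the table does not list raise an error rather than guess.

```python
# Rows copied from the table: (level of X, level of Y) -> coefficient.
COEFFICIENT_BY_LEVELS = {
    ("nominal", "nominal"): "Phi coefficient",
    ("nominal", "ordinal"): "Rank biserial coefficient",
    ("nominal", "interval/ratio"): "Point biserial coefficient",
    ("ordinal", "ordinal"): "Spearman rank-order coefficient (rho)",
    ("interval/ratio", "interval/ratio"): "Pearson r",
}


def choose_coefficient(x_level: str, y_level: str) -> str:
    """Return the coefficient the table lists for the two measurement levels."""
    key = (x_level, y_level)
    if key in COEFFICIENT_BY_LEVELS:
        return COEFFICIENT_BY_LEVELS[key]

    # Assumption: the order of X and Y does not change which coefficient applies.
    swapped = (y_level, x_level)
    if swapped in COEFFICIENT_BY_LEVELS:
        return COEFFICIENT_BY_LEVELS[swapped]

    # The table has no row for this pairing, so do not invent one.
    raise ValueError(f"no coefficient listed for {x_level} with {y_level}")


if __name__ == "__main__":
    print(choose_coefficient("ordinal", "ordinal"))
    print(choose_coefficient("interval/ratio", "nominal"))
```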
Type of Reliability | Application |
Test-retest | Use this type of reliability estimate whenever you are measuring a trait over a period of time. Example: teacher job satisfaction during the school year |
Parallel forms | Use this type of reliability estimate whenever you need different forms of the same test to measure the same trait. Example: multiple forms of the SAT |
Internal consistency | Use this type of reliability estimate whenever you need to summarize scores on individual items by an overall score. Example: combining the 20 items on a statistics test to represent level of knowledge about a particular aspect of statistics |
Interrater | Use this type of reliability estimate whenever you involve multiple raters in scoring tests. Example: AP essay test grading |
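As an illustration of the internal consistency row, the sketch below computes Cronbach's alpha, one widely used internal consistency estimate. The table does not name a specific statistic, so the choice of alpha is an assumption, as are the function name `cronbach_alpha` and the data layout (one row per respondent, one column per item).

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (respondents) of item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("need at least two items")

    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical data: three respondents answering two items.
    example = [[2, 3], [4, 4], [6, 8]]
    print(round(cronbach_alpha(example), 3))
```

Values closer to 1 indicate that the items vary together, which supports combining them into a single overall score.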
Type of Validity | Application |
Content | Use this type of validity estimate whenever you are comparing test items with a larger domain of knowledge. Example: assessing the breadth of a comprehensive final exam |
Criterion | Use this type of validity estimate whenever you are comparing a new test with an established standard. Example: developing a test to predict genius (predictive), or developing a new test comparable to the CAHSEE high school exit exam (concurrent) |
Construct | Use this type of validity estimate whenever you are comparing a test with the elements of a theoretical definition of a trait. Example: developing a new test for musical intelligence - distinguishing between musical and other types of intelligence |