The formula for r looks like this:
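
One standard way to write the Pearson correlation coefficient is given below. The notation is an assumption of this sketch: $n$ paired observations $(x_i, y_i)$ with sample means $\bar{x}$ and $\bar{y}$.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The numerator measures how the two variables vary together, and the denominator rescales that quantity so that r always falls between -1 and 1.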


| Absolute value of r | Strength |
| --- | --- |
| 0.0-0.2 | very weak |
| 0.2-0.4 | weak |
| 0.4-0.6 | moderate |
| 0.6-0.8 | strong |
| 0.8-1.0 | very strong |

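
As a rough illustration of how the strength table might be applied, the sketch below computes Pearson's r from deviation scores and then looks up a label. The function names and the sample data are hypothetical. Because the bands in the table share their endpoints, the sketch assumes each band includes its lower bound (so 0.2 counts as "weak"); the table itself does not say which way ties go.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


def strength_label(r):
    """Map r to the table's verbal label, using the magnitude of r only."""
    magnitude = abs(r)
    # Assumption: a value on a shared boundary goes to the higher band.
    if magnitude < 0.2:
        return "very weak"
    if magnitude < 0.4:
        return "weak"
    if magnitude < 0.6:
        return "moderate"
    if magnitude < 0.8:
        return "strong"
    return "very strong"


# Hypothetical data: hours studied and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(f"r = {r:.3f} ({strength_label(r)})")
```

The sign of r gives the direction of the relationship, while the label depends only on its magnitude, so r = -0.9 and r = 0.9 are both "very strong".
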

| X variable | Y variable | Correlation coefficient |
| --- | --- | --- |
| nominal | nominal | Phi coefficient |
| nominal | ordinal | Rank biserial coefficient (r_rb) |
| nominal | interval/ratio | Point biserial coefficient (r_pb) |
| ordinal | ordinal | Spearman rank-order coefficient (rho) |
| interval/ratio | interval/ratio | Pearson r |

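
To make the ordinal/ordinal row concrete, here is a minimal sketch of the Spearman rank-order coefficient, computed as Pearson's r applied to ranks. The helper names and the judges' data are hypothetical, and tied values are assumed to receive the average of the ranks they span, which is a common convention rather than something the table specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rho: Pearson r computed on the average ranks of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_rx = sum(rx) / len(rx)
    mean_ry = sum(ry) / len(ry)
    numerator = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    denominator = math.sqrt(
        sum((a - mean_rx) ** 2 for a in rx) * sum((b - mean_ry) ** 2 for b in ry)
    )
    if denominator == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return numerator / denominator


# Hypothetical ordinal data: two judges ranking the same five essays.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 3, 5, 4]
rho = spearman_rho(judge_a, judge_b)
print(f"rho = {rho:.2f}")
```

Because only the ordering of the values matters, any monotonic rescaling of either variable leaves rho unchanged, which is why it suits ordinal data.
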

| Type of Reliability | Application |
| --- | --- |
| Test-retest | Use this type of reliability estimate whenever you are measuring a trait over a period of time. Example: teacher job satisfaction during the school year |
| Parallel forms | Use this type of reliability estimate whenever you need different forms of the same test to measure the same trait. Example: multiple forms of the SAT |
| Internal consistency | Use this type of reliability estimate whenever you need to summarize scores on individual items by an overall score. Example: combining the 20 items on a statistics test to represent level of knowledge about a particular aspect of statistics |
| Interrater | Use this type of reliability estimate whenever you involve multiple raters in scoring tests. Example: AP essay test grading |

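
As one possible illustration of the internal consistency row, the sketch below computes Cronbach's alpha, a widely used internal consistency estimate. The table does not name a specific statistic, so the choice of alpha, the function name, and the sample scores are all assumptions of this example.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table with one row per respondent, one column per item."""
    if len(item_scores) < 2:
        raise ValueError("need at least 2 respondents")
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("need at least 2 items")
    if any(len(row) != n_items for row in item_scores):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*item_scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: five respondents answering three items scored 1 to 5.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

Alpha rises when items vary together, that is, when the variance of the total score is large relative to the sum of the individual item variances, which is the sense in which the items can be summarized by one overall score.
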

| Type of Validity | Application |
| --- | --- |
| Content | Use this type of validity estimate whenever you are comparing test items with a larger domain of knowledge. Example: assessing the breadth of a comprehensive final exam |
| Criterion | Use this type of validity estimate whenever you are comparing a new test with an established standard. Example: developing a test to predict genius (predictive), or developing a new test comparable to the CAHSEE, a high school exit exam (concurrent) |
| Construct | Use this type of validity estimate whenever you are comparing a test with the elements of a theoretical definition of a trait. Example: developing a new test for musical intelligence that distinguishes musical intelligence from other types of intelligence |

