This study analyzes the reliability of approximately 800,000 college grades from three higher education institutions that vary in type and size. Comparisons of intraclass correlation coefficients (ICCs) reveal patterns among institutions and academic disciplines, and the results suggest that grading styles are associated with academic disciplines. At one institution, the ICC for individual grade assignments is comparable to that of rubric-derived learning assessments, and both are arguably too low to support decision making at the level of individual grades. A reliability lift calculation suggests that grade averages over eight (or so) courses per student are reliable enough to be used as outcome measures. We discuss how grade statistics can complement efforts to assess program fairness, rigor, and comparability, and to assess the complexity of a curriculum. R code and statistical notes are included to facilitate use by assessment and institutional research offices.
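
One way to read the reliability lift claim is through the Spearman-Brown formula, which predicts the reliability of an average of k comparable measurements from the reliability of a single one. The sketch below assumes that formula, which the abstract does not name, and uses a hypothetical single-grade ICC of 0.30 that is not taken from the study. The function name `spearman_brown` is likewise illustrative.

```python
def spearman_brown(icc_single: float, k: int) -> float:
    """Predicted reliability of the mean of k measurements.

    Assumes the Spearman-Brown formula; the study's actual
    reliability lift calculation may differ.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * icc_single / (1 + (k - 1) * icc_single)


# Hypothetical single-grade ICC, chosen only for illustration.
icc_one_grade = 0.30

# Predicted reliability of a grade average as more courses are included.
for k in (1, 2, 4, 8, 16):
    print(f"{k:2d} courses: {spearman_brown(icc_one_grade, k):.2f}")
```

Under these assumed inputs, averaging eight grades lifts the predicted reliability from 0.30 to roughly 0.77, which illustrates why an average over several courses can serve as an outcome measure when a single grade cannot.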
