Abstract

A number of findings in the field of machine learning have given rise to questions about what it means for automated scoring or decision-making systems to be fair. One center of gravity in this discussion is whether such systems ought to satisfy classification parity (which requires parity in predictive performance across groups defined by protected attributes) or calibration (which requires similar predictions to have similar meanings across groups defined by protected attributes). Central to this discussion are impossibility results, which show that classification parity and calibration are often incompatible. This paper argues that classification parity, calibration, and a newer measure, counterfactual fairness, are unsatisfactory measures of fairness; offers a general diagnosis of the failure of these measures; and sketches an alternative approach to understanding fairness in machine learning.
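The tension the impossibility results describe can be seen in a toy numerical sketch (an illustration constructed here, not an example from the paper): when two groups have different base rates, a risk score can be perfectly calibrated in both groups while its false positive and false negative rates differ between them. All counts below are hypothetical.

```python
# Hypothetical buckets per group: (score, positives, negatives).
# Group A has base rate 0.50; group B has base rate 0.35.
groups = {
    "A": [(0.8, 80, 20), (0.2, 20, 80)],
    "B": [(0.8, 40, 10), (0.2, 30, 120)],
}

def calibration(buckets):
    # Calibration: within each score bucket, the fraction of true
    # positives should equal the score itself.
    return {s: pos / (pos + neg) for s, pos, neg in buckets}

def error_rates(buckets, threshold=0.5):
    # Predict positive iff score >= threshold; return (FPR, FNR).
    fp = sum(neg for s, _, neg in buckets if s >= threshold)
    fn = sum(pos for s, pos, _ in buckets if s < threshold)
    negatives = sum(neg for _, _, neg in buckets)
    positives = sum(pos for _, pos, _ in buckets)
    return fp / negatives, fn / positives

for name, buckets in groups.items():
    print(name, calibration(buckets), error_rates(buckets))
# Both groups are calibrated (the 0.8 bucket is 80% positive, the 0.2
# bucket 20% positive), yet (FPR, FNR) is (0.20, 0.20) for A and about
# (0.077, 0.429) for B: calibration without classification parity.
```

Because the two groups' base rates differ, no threshold choice equalizes both error rates while preserving calibration, which is the pattern the impossibility results generalize.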
