Machine Bias and ProPublica

by Kenneth Enevoldsen | 2021-05-15

A brief introduction

ProPublica and COMPAS

Three fairness conditions

Well-calibrated, statistical parity

Balance for the positive class

Balance for the negative class

Statistical parity

A person in group $a$ and a person in group $b$ should have equal probability of being assigned to the positive class (and therefore also to the negative class).

$$ P(R = + | A=a) = P(R = + | A = b) \quad \forall a, b\in A $$

Where $R$ is the predicted response variable.
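The condition can be estimated from data by comparing the share of positive predictions in each group. The following is a minimal sketch, not a reference implementation: the function names `positive_rate_by_group` and `statistical_parity_gap` and the toy data are made up for illustration, and predictions are assumed to be coded as 1 for $+$ and 0 for $-$.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Estimate P(R = + | A = a) for every group a.

    predictions: iterable of 0/1 predicted classes (1 means "+").
    groups: iterable of group labels, aligned with predictions.
    """
    counts = defaultdict(int)     # number of people seen per group
    positives = defaultdict(int)  # number of "+" predictions per group
    for r, a in zip(predictions, groups):
        counts[a] += 1
        positives[a] += r
    return {a: positives[a] / counts[a] for a in counts}


def statistical_parity_gap(predictions, groups):
    """Largest difference in positive rate between any two groups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): four people in each of two groups.
predictions = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(predictions, groups)
gap = statistical_parity_gap(predictions, groups)
print(rates, gap)
```

A gap of zero corresponds to exact statistical parity. On finite samples some tolerance is needed, and the choice of tolerance is a judgment call rather than part of the definition.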

Balance for positive class

The average score received by a positive case (unknown to the model) should be the same in each group.

$$ E(S |Y=+, A=a) = E(S |Y=+, A=b) \quad \forall a, b\in A $$

Where $S$ is the probability score and $Y$ is the actual response variable.

Balance for negative class

Analogously, the average score received by a negative case should be the same in each group:

$$ E(S |Y=-, A=a) = E(S |Y=-, A=b) \quad \forall a, b\in A $$

Where $S$ is the probability score and $Y$ is the actual response variable.
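Both balance conditions are conditional means of the score, so they can be estimated with one helper that filters on the actual outcome. This is a sketch under assumptions: the name `mean_score_by_group` and the toy data are hypothetical, and outcomes are assumed to be coded as 1 for $+$ and 0 for $-$.

```python
from collections import defaultdict


def mean_score_by_group(scores, labels, groups, label):
    """Estimate E(S | Y = label, A = a) for every group a."""
    totals = defaultdict(float)  # sum of scores among people with Y == label
    counts = defaultdict(int)    # number of people with Y == label
    for s, y, a in zip(scores, labels, groups):
        if y == label:
            totals[a] += s
            counts[a] += 1
    return {a: totals[a] / counts[a] for a in counts}


# Toy data (hypothetical): two positives and two negatives per group.
scores = [0.9, 0.7, 0.2, 0.4, 0.6, 0.6, 0.1, 0.3]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

positive_balance = mean_score_by_group(scores, labels, groups, label=1)
negative_balance = mean_score_by_group(scores, labels, groups, label=0)
print(positive_balance)
print(negative_balance)
```

In this toy data neither condition holds: positives in group $a$ average about 0.8 against about 0.6 in group $b$, and negatives average about 0.3 against about 0.2.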

Theorem 1.1

“Consider an instance of the problem in which there is a risk assignment satisfying the [three fairness conditions]. Then the instance must either allow for perfect prediction […] or have equal base rates.”
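Read in the contrapositive, the theorem says that with unequal base rates and imperfect prediction, the three conditions cannot all hold. The sketch below illustrates this on made-up numbers. It assumes that the calibration condition means that, within each group, a fraction $s$ of the people who receive score $s$ are actually positive. The bin counts, group labels and function names are hypothetical.

```python
# Each group is a list of bins: (score, number_of_people, number_of_positives).
# The counts are chosen so that positives / people equals the score in every
# bin, i.e. the score is calibrated within each group (assumed definition).
GROUP_BINS = {
    "a": [(0.8, 50, 40), (0.2, 50, 10)],
    "b": [(0.8, 20, 16), (0.2, 80, 16)],
}


def base_rate(bins):
    """Share of actual positives in the group."""
    people = sum(n for _, n, _ in bins)
    positives = sum(p for _, _, p in bins)
    return positives / people


def is_calibrated(bins, tol=1e-9):
    """True if every bin's share of positives matches its score."""
    return all(abs(p / n - s) < tol for s, n, p in bins)


def mean_score(bins, positive):
    """Average score among actual positives (or actual negatives)."""
    total = 0.0
    count = 0
    for s, n, p in bins:
        members = p if positive else n - p
        total += s * members
        count += members
    return total / count


summary = {
    g: {
        "base_rate": base_rate(bins),
        "calibrated": is_calibrated(bins),
        "mean_score_pos": mean_score(bins, positive=True),
        "mean_score_neg": mean_score(bins, positive=False),
    }
    for g, bins in GROUP_BINS.items()
}
for g, row in summary.items():
    print(g, row)
```

Here both groups are calibrated, the base rates differ (0.5 against 0.32), and the scores 0.8 and 0.2 are far from a perfect 0 or 1 prediction. As the theorem requires, balance then fails: the average score of positives and the average score of negatives both differ between the groups.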

Theorem 1.2

The approximate version of 1.1.

approximately fair $\Rightarrow$ approximately perfect prediction or approximately equal base rates

Picking our poison 🦠🧪

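Take a medical test for a disease that is more prevalent in one gender than the other. Unless the test predicts perfectly, at least one of the following must hold: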

the test’s probability estimates are systematically skewed upward or downward for at least one gender

the test assigns a higher average risk estimate to healthy people in one gender than the other

the test assigns a higher average risk estimate to carriers of the disease in one gender than the other

References