Explaining and Checking Fairness
for Predictive Models
Przemysław Biecek
3rd Workshop
eXplaining Knowledge Discovery in Data Mining 2021
Materials: https://tinyurl.com/xkdd-fairness
Slack: https://tinyurl.com/xkdd21
https://github.com/ModelOriented/fairmodels
Do algorithms
discriminate?
https://www.propublica.org/article/facebook-ads-can-still-discriminate-against-women-and-older-workers-despite-a-civil-rights-settlement
Racist Soap Dispenser
https://twitter.com/nke_ise/status/897756900753891328
Cathy O'Neil:
The era of blind faith in big data must end
• “You don’t see a lot of skepticism,” she says. “The algorithms are like shiny new toys that we can’t resist using. We trust them so much that we project meaning onto them.”
• Ultimately algorithms, according to O’Neil, reinforce discrimination and widen inequality, “using people’s fear and trust of mathematics to prevent them from asking questions”.
https://www.theguardian.com/books/2016/oct/27/cathy-oneil-weapons-of-math-destruction-algorithms-big-data
What does it mean to discriminate?
https://fra.europa.eu/sites/default/files/fra_uploads/fra-2018-handbook-non-discrimination-law-2018_en.pdf
https://fra.europa.eu/en/publication/2018/handbook-european-non-discrimination-law-2018-edition
PROTECTED GROUNDS
- Sex
- Gender identity
- Sexual orientation
- Disability
- Age
- Race, ethnicity, colour and membership of a national minority
- Nationality or national origin
- Religion or belief
- Social origin, birth and property
- Language
- Political or other opinion
Moritz Hardt 2020, Fairness and Machine Learning (MLSS)
https://www.youtube.com/watch?v=Igq_S_7IfOU
Is different treatment always discrimination?
Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare
Cirillo et al 2020
https://www.nature.com/articles/s41746-020-0288-5
Think about the whole process
bias may be everywhere
Some sources of bias
• Historical bias. The data are correctly sampled and correspond well to the observed relationships, but due to different treatment in the past some prejudices are encoded in the data. Think about gender and occupation stereotypes.
• Representation bias. The available data is not a representative sample of the population of interest. Think about the available facial images of actors, often white men. Or genetic sequences of covid variants, mostly collected in developed European countries. Or crime statistics in the regions to which the police are directed.
• Measurement bias. The variable of interest is not directly observable or is difficult to measure, and the way it is measured may be distorted by other factors. Think of the results of mathematics skills assessments (e.g. PISA) measured by tasks on computers not that widely available in some countries.
• Evaluation bias. The evaluation of the algorithm is performed on a population that does not represent all groups. Think of a lung screening algorithm tested primarily on a population of smokers (older men).
• Proxy bias. The algorithm uses variables that are proxies for protected attributes. Think of male/female-only schools where the gender effect can be hidden under the school effect.
partially based on Cirillo et al, 2020. https://www.nature.com/articles/s41746-020-0288-5
Explanatory Model Analysis
https://ema.drwhy.ai/
[Diagram of the model development process, with each stage marked as a possible source of bias]
An interesting example is the StreetBumps project.
The city of Boston released a mobile phone application that identifies potholes based on vibrations measured by the accelerometer.
It is a very innovative idea, but when analyzing such data one has to take into account the representativeness of the collected data.
Much more information about potholes will come from the neighborhoods where wealthier and younger people live, who use mobile phones more often.
https://hbr.org/2013/04/the-hidden-biases-in-big-data
Bias encoded in
data embeddings
Man is to Computer Programmer as Woman is to Homemaker?
Debiasing Word Embeddings
Bolukbasi et al, 2016
https://arxiv.org/pdf/1607.06520.pdf
Measuring Bias in Contextualized Word Representations
Bolukbasi et al 2016, https://arxiv.org/pdf/1607.06520.pdf
Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems
Kiritchenko et al 2018, https://arxiv.org/pdf/1805.04508.pdf
Learning Gender-Neutral Word Embeddings
Zhao et al 2018, https://aclanthology.org/D18-1521/
Mitigating Gender Bias in Natural Language Processing: Literature Review
Sun et al 2019, https://arxiv.org/pdf/1906.08976.pdf
Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function
Qian et al 2019, https://arxiv.org/pdf/1905.12801.pdf
Identifying and Reducing Gender Bias in Word-Level Language Models
Bordia et al 2019, https://aclanthology.org/N19-3002.pdf
Investigating Gender Bias in Language Models Using Causal Mediation Analysis
Vig et al 2020, https://proceedings.neurips.cc/paper/2020/file/92650b2e92217715fe312e6fa7b90d82-Paper.pdf
Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models
de Vassimon Manela et al 2021, https://arxiv.org/pdf/2101.09688.pdf
Is it always about higher
scores?
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
Buolamwini and Gebru, 2018
https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
Buolamwini and Gebru, 2018
Lower performance in a group can create a problem in future data.
Think about credit lines.
If loans in group A are assigned with high accuracy and loans in group B are assigned with low accuracy, then over time we will obtain a dataset in which many loans in group A are repaid and many loans in group B are not.
This happens only because the people who were granted credit in group B were selected more randomly.
https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
Disregarding sensitive attributes is not a solution
The risk of "gaming fairness"
https://arxiv.org/abs/2007.09969
https://arxiv.org/abs/1901.09749
Fairness measures
Notation
Y: true class (1 is the preferred, favourable outcome)
S: predicted score
Ŷ: decision (predicted class)
A: protected attribute
Group fairness / statistical parity / independence / demographic parity
The predicted class is independent of the protected attribute.
"Four-fifths rule": a selection rate for any disadvantaged group that is less than four-fifths of that for the group with the highest rate.
Sounds like a good idea, but it is easy to fool.
For example, in group A we use a valid classifier while in group B we make decisions randomly.
A perfect classifier does not satisfy this parity if the base rates differ between groups.
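A minimal base-R sketch of this check; the data frame df with columns pred (predicted class) and group (protected attribute) is toy data made up for illustration.

# toy data: pred = predicted class (0/1), group = protected attribute
set.seed(1)
df <- data.frame(pred  = rbinom(1000, 1, 0.4),
                 group = sample(c("a", "b"), 1000, replace = TRUE))

selection_rate <- tapply(df$pred, df$group, mean)       # P(D = 1 | A = a) per group
parity_ratio   <- selection_rate / max(selection_rate)  # each group vs the best-treated group
parity_ratio >= 0.8                                     # the four-fifths rule check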
Equal opportunity
Equal True Positive Rate TPR = TP/(TP+FN) for each subgroup.
If they pay back the credit, they should have an equal chance of getting it.
Predictive equality
Equal False Positive Rate FPR = FP/(FP+TN) for each subgroup.
Equalized odds, Separation, Positive Rate Parity
Equal True Positive Rate TPR = TP/(TP+FN) and equal False Positive Rate FPR = FP/(FP+TN) for each subgroup.
The predicted class is independent of the protected attribute given the true class.
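A base-R sketch of these separation-type metrics on toy data; y, pred, and group are made up for illustration.

set.seed(1)
y     <- rbinom(1000, 1, 0.5)                       # true class
group <- sample(c("a", "b"), 1000, replace = TRUE)  # protected attribute
pred  <- ifelse(runif(1000) < 0.8, y, 1 - y)        # noisy predicted class

rates <- sapply(split(data.frame(y, pred), group), function(d)
  c(TPR = mean(d$pred[d$y == 1] == 1),   # equal opportunity compares TPR across groups
    FPR = mean(d$pred[d$y == 0] == 1)))  # predictive equality compares FPR across groups
rates  # equalized odds requires both rows to be (approximately) equal across groups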
Positive Predictive Parity
Equal Positive Predictive Value PPV = TP/(TP+FP) for each subgroup.
If they get the credit, they should have an equal chance of paying it back.
Negative Predictive Parity
Equal Negative Predictive Value NPV = TN/(TN+FN) for each subgroup.
Predictive Rate Parity, Sufficiency
Equal Positive Predictive Value PPV = TP/(TP+FP) and equal Negative Predictive Value NPV = TN/(TN+FN) for each subgroup.
The true class is independent of the protected attribute given the predicted class.
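A matching sketch for these sufficiency-type metrics; y, pred, and group are again toy data made up for illustration.

set.seed(1)
y     <- rbinom(1000, 1, 0.5)
group <- sample(c("a", "b"), 1000, replace = TRUE)
pred  <- ifelse(runif(1000) < 0.8, y, 1 - y)

ppv_npv <- sapply(split(data.frame(y, pred), group), function(d)
  c(PPV = mean(d$y[d$pred == 1] == 1),   # positive predictive parity compares PPV
    NPV = mean(d$y[d$pred == 0] == 0)))  # negative predictive parity compares NPV
ppv_npv  # predictive rate parity (sufficiency) requires both rows to match across groups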
Whether to sentence a prisoner
Demographic parity:
the rate of sentenced prisoners should be equal in
each group
fair from society's perspective
Equal opportunity:
The fraction of innocents sentenced should be
equal in subgroups
fair from the prisoner's perspective (ProPublica)
Predictive Rate Parity:
Among the convicted, there should be an equal
fraction of innocents
fair from the judge's perspective (Northpointe)
The Fairness Trade-off (the impossibility theorem)
Except for trivial cases, all these criteria cannot be satisfied jointly.
In fact, any two out of {Sufficiency, Separation, Independence} are mutually exclusive.
https://fairmlbook.org/
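Using the notation above, the three families of criteria can be written compactly (following fairmlbook.org):

\begin{aligned}
\text{Independence:} \quad & \hat{Y} \perp A \\
\text{Separation:}   \quad & \hat{Y} \perp A \mid Y \\
\text{Sufficiency:}  \quad & Y \perp A \mid \hat{Y}
\end{aligned}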
Bias mitigation strategies
Data Pre-processing
change the data to improve model fairness, for example, using subsampling or case weighting
Model In-processing
modify the optimized criterion to include fairness functions, e.g. through adversarial training
Model Post-processing
modify the resulting model scores or final decisions, e.g. using different thresholds
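A base-R sketch of two of these strategies on toy data (y, group, and score are made up); the pre-processing weights follow the common reweighing scheme w = P(A=a)·P(Y=y) / P(A=a, Y=y), and the group-specific thresholds are purely illustrative.

set.seed(1)
y     <- rbinom(1000, 1, 0.5)                       # true class
group <- sample(c("a", "b"), 1000, replace = TRUE)  # protected attribute
score <- runif(1000)                                # model scores

# Pre-processing: case weights that make Y independent of the protected attribute,
# to be passed e.g. as `weights` when refitting the model
w <- ave(y, group, FUN = length) *                   # n * P(A = a)
     ifelse(y == 1, mean(y), 1 - mean(y)) /          # P(Y = y)
     ave(rep(1, length(y)), group, y, FUN = length)  # divided by n * P(A = a, Y = y)

# Post-processing: group-specific decision thresholds on the scores
thr  <- c(a = 0.5, b = 0.4)             # illustrative thresholds only
pred <- as.numeric(score > thr[group])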
Hands on example
with the fairmodels package
Wiśniewski et al, 2020, https://arxiv.org/abs/2104.00507
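A sketch of the basic fairmodels workflow. It assumes the German Credit data shipped with the package (data("german"), with columns such as Risk and Sex); check the package documentation, as column names, levels, and arguments may differ between versions.

library(DALEX)
library(fairmodels)

data("german")                          # German Credit data from fairmodels
y <- as.numeric(german$Risk) - 1        # 1 = good credit (favourable outcome)

model     <- glm(Risk ~ ., data = german, family = "binomial")
explainer <- explain(model, data = german[, -1], y = y)

fobject <- fairness_check(explainer,
                          protected  = german$Sex,  # protected attribute
                          privileged = "male",      # privileged (reference) group
                          epsilon    = 0.8)         # four-fifths rule threshold
print(fobject)
plot(fobject)   # which parity metrics are violated and by how much

Mitigation helpers from fairmodels such as reweight(), resample(), and roc_pivot() can then be plugged into this loop; see the package vignette for details.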
The four-fifths rule (Code of Federal Regulations, 1978)
"A selection rate for any race, sex, or ethnic group which is less than four-fifths (4/5) (or eighty percent) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact [...]."
http://aif360.mybluemix.net/data
Questions? Comments?
Przemysław Biecek
https://www.linkedin.com/in/pbiecek/

Special thanks to MI2 DataLab:
Hubert Baniecki (modelStudio)
Ewa Baranowska (drifter)
Alicja Gosiewska (auditor)
Aleksandra Grudziąż (survxai)
Adam Izdebski (describe)
Ewelina Karbowiak (EIX)
Marcin Kosiński (archivist)
Ania Kozak (vivo)
Michał Kuźba (xaibot)
Szymon Maksymiuk (DALEXtra)
Magda Młynarczyk (cr17)
Aleksandra Paluszyńska (randomForestExplainer)
Kasia Pękała (triplot)
Piotr Piątyszek (Arena)
Hanna Piotrowska (DrWhy theme)
Adam Rydelek (xai2cloud)
Agnieszka Sitko (factorMerger)
Jakub Wiśniewski (fairModels)
… and others from MI2DataLab