
Data Fallacies To Avoid

The document outlines various data fallacies to avoid, including cherry picking, data dredging, and survivorship bias, which can lead to misleading conclusions. It emphasizes the importance of using representative data and being cautious of biases such as false causality and sampling bias. Additionally, it highlights the dangers of overfitting models and relying solely on summary metrics, which can obscure significant insights.

DATA FALLACIES TO AVOID

CHERRY PICKING
Selecting results that fit your claim and excluding those that don’t.

DATA DREDGING
Repeatedly testing new hypotheses against the same set of data, failing to acknowledge that most correlations will be the result of chance.

SURVIVORSHIP BIAS
Drawing conclusions from an incomplete set of data, because that data has ‘survived’ some selection criteria.
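
To put a number on the data dredging entry, here is a minimal sketch of how quickly chance findings accumulate. It assumes the tests are independent and uses a 5% significance threshold; both are illustrative assumptions, and the function name is hypothetical.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests run on
    pure noise looks 'significant' at threshold `alpha` (assumed values)."""
    return 1 - (1 - alpha) ** num_tests


# The more hypotheses tried against the same data, the likelier a spurious hit.
for n in (1, 5, 20, 100):
    print(f"{n:>3} tests: {chance_of_false_positive(n):.0%} chance of a spurious result")
```

Under these assumptions, twenty tests on data with no real effect are more likely than not to produce at least one apparent finding.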

COBRA EFFECT
Setting an incentive that accidentally produces the opposite result to the one intended. Also known as a Perverse Incentive.

FALSE CAUSALITY
Falsely assuming when two events appear related that one must have caused the other.

GERRYMANDERING
Manipulating the geographical boundaries used to group data in order to change the result.
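
The gerrymandering entry can be illustrated with a toy example. The nine votes and the two district plans below are invented for illustration, and the function name is hypothetical; the point is only that the same data grouped differently gives a different result.

```python
from collections import Counter

# Hypothetical votes from nine voters, indexed 0..8 (4 for A, 5 for B).
votes = ["A", "A", "B", "A", "A", "B", "B", "B", "B"]

# Two hypothetical ways of drawing three districts of three voters each.
plan_1 = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
plan_2 = [(0, 1, 3), (2, 4, 5), (6, 7, 8)]


def seats_won(votes, plan):
    """Count how many districts each option wins by simple majority."""
    winners = []
    for district in plan:
        tally = Counter(votes[i] for i in district)
        winners.append(tally.most_common(1)[0][0])
    return Counter(winners)


print("Plan 1:", dict(seats_won(votes, plan_1)))
print("Plan 2:", dict(seats_won(votes, plan_2)))
```

In this sketch A takes two of three districts under the first plan and B takes two of three under the second, although no vote has changed.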

SAMPLING BIAS
Drawing conclusions from a set of data that isn’t representative of the population you’re trying to understand.

GAMBLER’S FALLACY
Mistakenly believing that because something has happened more frequently than usual, it’s now less likely to happen in future (and vice versa).

HAWTHORNE EFFECT
The act of monitoring someone can affect their behaviour, leading to spurious findings. Also known as the Observer Effect.
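
A short simulation can make the gambler’s fallacy concrete. This is a sketch under stated assumptions: a fair coin, independent flips, and an arbitrary seed, streak length and flip count chosen for illustration; the function name is hypothetical.

```python
import random


def heads_rate_after_streak(num_flips=200_000, streak=5, seed=0):
    """Share of heads on flips that immediately follow `streak` heads in a row."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    run = 0                    # consecutive heads seen just before the current flip
    followers = []             # outcomes of flips that came right after a streak
    for _ in range(num_flips):
        flip = rng.random() < 0.5  # True means heads (fair coin assumed)
        if run >= streak:
            followers.append(flip)
        run = run + 1 if flip else 0
    return sum(followers) / len(followers)


print(f"Heads rate after 5 heads in a row: {heads_rate_after_streak():.3f}")
```

The rate stays close to one half: for independent events, a run of heads does not make tails any more likely on the next flip.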

REGRESSION TOWARDS THE MEAN
When something happens that’s unusually good or bad, it will tend to move back towards the average over time.

SIMPSON’S PARADOX
When a trend appears in different subsets of data but disappears or reverses when the groups are combined.

MCNAMARA FALLACY
Relying solely on metrics in complex situations and losing sight of the bigger picture.
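
Simpson’s paradox is easiest to see with numbers. The counts below are invented for illustration (two hypothetical options, A and B, tried in two hypothetical groups), and the helper names are assumptions.

```python
from fractions import Fraction

# Hypothetical (successes, trials) for options A and B in two groups.
results = {
    "group X": {"A": (8, 10), "B": (70, 100)},
    "group Y": {"A": (20, 100), "B": (1, 10)},
}


def rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def combined_rate(option):
    """Success rate for one option after pooling both groups."""
    successes = sum(results[group][option][0] for group in results)
    trials = sum(results[group][option][1] for group in results)
    return Fraction(successes, trials)


for group, options in results.items():
    print(group, {name: float(rate(*counts)) for name, counts in options.items()})
print("combined", {name: round(float(combined_rate(name)), 3) for name in ("A", "B")})
```

In this sketch A has the higher success rate inside each group, yet B has the higher rate once the groups are pooled, because each option was mostly tried in a different group.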

OVERFITTING
Creating a model that’s overly tailored to the data you have and not representative of the general trend.

PUBLICATION BIAS
Interesting research findings are more likely to be published, distorting our impression of reality.

DANGER OF SUMMARY METRICS
Only looking at summary metrics and missing big differences in the raw data.
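
As a small illustration of the danger of summary metrics, the two invented datasets below share the same mean and median while looking nothing alike in the raw values. The variable names and numbers are hypothetical.

```python
import statistics

# Two hypothetical sets of scores with identical headline figures.
steady = [50, 50, 50, 50, 50, 50]
split = [0, 0, 0, 100, 100, 100]

for name, data in (("steady", steady), ("split", split)):
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "spread:", statistics.pstdev(data),
        "raw:", data,
    )
```

Here a report showing only the mean or median would treat the two sets as the same; only the spread or the raw values reveal the difference.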

Read more at data-literacy.geckoboard.com
