Correlation != Causation; However, It is Often All We Have

> correlation doesn't mean causation.

As a statistician, I guess I should be happy that more people are aware of this. But I also think too many people take "correlation != causation" superficially. I mean, almost all of science is based on significant correlational findings, especially when the traditional way to prove causation (i.e. a randomized trial) is unethical (e.g. we can't randomly assign people to be insured vs. uninsured).

Along these lines, I often find that people who say "correlation != causation" don't stop and wonder "so how _can_ we prove causation (in a non-randomized study)?" I guess many of them can be partially excused, since the answer is non-trivial. But generally, here are a few rules of thumb for making a stronger case for causality from correlation:

* the effect size is relatively large (e.g. uninsured children have 60% higher odds of dying than insured children)

* the cause comes before the effect (e.g. people are uninsured before they go to the hospital and/or die)

* the association is reversible (e.g. the risk of dying at a hospital changes when people get insurance)

* consistency / consensus across multiple studies (e.g. many studies showing that a difference in insurance status is associated with a significant difference in hospital mortality)

* dose-response relationship (e.g. I didn't link examples previously, but a few studies have shown that different levels of insurance, from none to Medicaid to private, are associated with different rates of hospital mortality)

* plausibility (e.g. even from a qualitative point of view, it's quite believable that people who are unable to pay a hospital bill might get worse service)
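To make two of these rules concrete, here is a minimal Python sketch of what "60% higher odds" and a dose-response pattern look like in numbers. Everything in it is an assumption for illustration: the counts and rates are invented (not taken from any study), and `odds_ratio` is a hypothetical helper name, not a library function.

```python
def odds_ratio(exposed_events, exposed_total, control_events, control_total):
    """Odds of the event in the exposed group divided by the odds in the control group."""
    exposed_odds = exposed_events / (exposed_total - exposed_events)
    control_odds = control_events / (control_total - control_events)
    return exposed_odds / control_odds


# Hypothetical counts, invented for illustration only.
uninsured_deaths, uninsured_total = 16, 1016
insured_deaths, insured_total = 10, 1010

ratio = odds_ratio(uninsured_deaths, uninsured_total, insured_deaths, insured_total)
print(f"odds ratio = {ratio:.2f}")  # 1.60, i.e. 60% higher odds

# Hypothetical mortality rates, ordered from least to most coverage.
rates_by_coverage = [("none", 0.016), ("Medicaid", 0.013), ("private", 0.010)]
rates = [rate for _, rate in rates_by_coverage]

# A dose-response pattern means the rate moves steadily as coverage increases.
is_monotonic = all(a > b for a, b in zip(rates, rates[1:]))
print("mortality falls steadily with more coverage:", is_monotonic)
```

Note that neither number says anything about causation on its own. A large odds ratio and a steady gradient only make the causal story harder to explain away, which is the point of the rules above.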

Notes:

Examples of when correlation should be taken seriously.

Folksonomies: statistics causation correlation

 Correlation and Causation
Electronic/World Wide Web>Message Posted to Online Forum/Discussion Group:  vwinsyee, (02/29/2014), Correlation and Causation, Retrieved on 2014-03-03
  • Source Material [news.ycombinator.com]