http://imgs.xkcd.com/comics/correlation.png

We can only establish causality with experimentation.

There are many times when we want or need statistical evidence of a cause-and-effect relationship, and sometimes we simply do not care about it or need it.

- Example 1: A person who takes medication A will have improved symptoms over a person who does not take medication A.
- Example 2: More time partying is negatively associated with higher grades.
- Example 3: People who own cats are more likely to drink tea than people who own dogs.

It is much harder to establish causality than to run a simple statistical test. So, if we do not need to establish a causal relationship, we probably do not want to spend the extra time and resources (although sometimes it does not take much more time, effort, or money).

Establishing causation refers to demonstrating the condition of cause and effect.

- Association does not demonstrate causation. Just because a statistical test shows that B changes when A changes, that does not mean we know that the change in A caused the change in B.
- We may not know whether there were changes in an unobserved or unmeasured variable.
- We may not know whether the association is "spurious".

The statistical method we use does not determine cause and effect; the design of our data collection is much more important.

Conditions

- Statistical evidence of an association or difference is necessary, but insufficient, to determine cause and effect.
- Temporal ordering of the variables is necessary.
- Control of other causal factors is necessary.

All three conditions must be met in order to establish causality.

Statistical Test

A necessary, but insufficient, condition for establishing causality is to have statistical evidence of an association or difference.

It is impossible to establish a cause-and-effect relationship without a statistical test showing some evidence of an effect.

Many different tests can be used to satisfy this condition:

- Dependency tests of association (such as regression or ANOVA)
- Tests of association (Chi-Squared, Spearman, or Pearson)
- Tests of differences (test of two means, test of two proportions)
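
As one illustration of a test of differences, the sketch below runs a two-sided test of two proportions using only the Python standard library. The function name and the counts are hypothetical, chosen to resemble the cat and dog owner example; they are not data from these notes. A small p-value here would satisfy only the first condition (statistical evidence), not the other two.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: tea drinkers among 200 cat owners and 200 dog owners.
z, p_value = two_proportion_z_test(120, 200, 90, 200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```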

Example

- Example 1: A person who takes medication A will have improved symptoms over a person who does not take medication A.
- Example 2: More time partying is negatively associated with higher grades.
- Example 3: People who own cats are more likely to drink tea than people who own dogs.

Temporal Ordering

In order for A to have caused a change in B, A has to come before B.

If we are not sure which came first, we cannot establish a cause-and-effect relationship.

Ensuring temporal ordering must be part of the data collection design (experimental design).

In the Dilbert cartoon above, the incorrect conclusion could have been avoided if the team had been aware of which came first: the illness or the purchase of products.
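
When the data collection records dates, temporal ordering can be checked directly. The sketch below is a minimal illustration with made-up records and hypothetical field names (`purchase`, `illness`); it keeps only the cases in which the purchase clearly came before the illness.

```python
from datetime import date

# Hypothetical records: when each person bought the product and when they fell ill.
records = [
    {"person": "A", "purchase": date(2024, 3, 1), "illness": date(2024, 3, 10)},
    {"person": "B", "purchase": date(2024, 3, 15), "illness": date(2024, 3, 5)},
    {"person": "C", "purchase": date(2024, 4, 2), "illness": None},
]

def purchase_preceded_illness(record):
    """True only when an illness is recorded and the purchase came strictly first."""
    illness = record["illness"]
    return illness is not None and record["purchase"] < illness

# Only these cases are even candidates for "the product caused the illness".
consistent = [r["person"] for r in records if purchase_preceded_illness(r)]
print(consistent)
```

Person B fell ill before buying, so that record cannot support the claim that the purchase caused the illness, whatever a statistical test might show.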

Competing Hypothesis

A competing hypothesis is an alternative reason for what is being observed.

- Why else might this event have happened?
- Changes in an unobserved or unmeasured variable might be the cause, or there may be no cause at all. Just because it looks like a cause does not mean it is!

All of these bulbs have similar tops, so it is easy to make a mistake.

- Many times an unobserved or unmeasured variable caused both observed variables to change, and the two observed variables are not related to each other.
- Spurious association is an observed correlation without any true relationship (e.g. shorter skirts do not make the stock market rise).

(Source: hipstermusings)

Randomly assigning respondents to treatment and control groups, together with careful experimental design, allows us to control for competing hypotheses.

It is frequently difficult to be sure that all possible competing hypotheses are effectively controlled.
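
To see why random assignment helps, the following sketch simulates respondents who each carry an unmeasured trait and then splits them at random into two groups. Everything here is assumed for illustration: the trait, its distribution, the group sizes, and the seed. The point is only that the trait ends up with a similar average in both groups, even though nobody measured it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical respondents, each with an unmeasured trait (e.g. baseline health).
respondents = [{"id": i, "hidden_trait": rng.gauss(50, 10)} for i in range(1000)]

# Random assignment: shuffle, then split into two equal groups.
rng.shuffle(respondents)
treatment = respondents[:500]
control = respondents[500:]

mean_treatment = statistics.mean(r["hidden_trait"] for r in treatment)
mean_control = statistics.mean(r["hidden_trait"] for r in control)
print(f"treatment: {mean_treatment:.1f}, control: {mean_control:.1f}")
```

With small groups the balance is only approximate, which is one reason it can still be hard to be sure every competing hypothesis is controlled.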

Examples - can we establish a cause and effect relationship?

While the data that we discussed for each of the examples were likely to produce statistically significant tests of association or difference, and for two of the examples we could be certain of temporal ordering, we have not yet established the conditions to declare cause and effect.

Example 1: A person who takes medication A will have improved symptoms over a person who does not take medication A.

In this example, we had data that would indicate that the people who took the Rx had improved symptoms over the people who did not. We could control for temporal ordering: we knew that the person either did or did not take the Rx prior to the check for improved symptoms.

Can we control for competing hypotheses? If we randomly assigned respondents to a group that received Medication A and to a group that received traditional treatment, we can assume away competing hypotheses because we would expect all other possible causes (e.g. some people heal faster, some people have natural immunity, some people are healthier otherwise, etc.) to be equally present in both groups. A spurious relationship can also be assumed away because of the randomization. Because we have a statistical test of a difference, controlled for temporal ordering, and controlled for competing hypotheses, we can claim cause and effect.

Example 2: More time partying is negatively associated with higher grades.

In this example, we had data that would indicate a negative association between the amount of time partying and the grades earned, and we could control for temporal ordering. Can we control for competing hypotheses? Typically, we cannot randomly assign respondents to party to varying degrees before an exam.

If we cannot randomly assign respondents to the various levels of partying, we cannot assume away competing hypotheses. There are other possible causes for any statistical relationship that we might observe:

- People who party too much might also have a lower IQ, so IQ might be the cause of both partying and grades.
- People who party too much might also live off campus, so living off campus might be the cause of both partying and grades.
- The relationship between grades and partying might be spurious.

So even though we have a statistical test of an association, and we controlled for temporal ordering, we could not control for competing hypotheses, so we cannot claim cause and effect.
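
The competing-hypothesis problem can be made concrete with a small simulation. In the sketch below, living off campus (the unmeasured variable) is assumed to raise partying and lower grades, while partying itself has no direct effect on grades at all. All numbers are invented for illustration. A clear negative correlation between partying and grades still appears.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
party_hours, grades = [], []
for _ in range(2000):
    off_campus = rng.random() < 0.5  # the unmeasured variable
    # Assumed effects: off-campus students party more AND earn lower grades.
    # Partying has no direct effect on grades anywhere in this simulation.
    party_hours.append(rng.gauss(10 if off_campus else 4, 2))
    grades.append(rng.gauss(70 if off_campus else 80, 5))

r = pearson(party_hours, grades)
print(f"correlation between partying and grades: {r:.2f}")
```

A statistical test on these data would be significant, yet concluding that partying causes lower grades would be wrong by construction.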

Copyright Dr. Nancy D. Albers-Miller, All Rights Reserved