Causality


• Causality
  • Statistical Test
  • Temporal Ordering
  • Control for Competing Hypotheses
     

Inferring Causality


Hume argued that the inference of a causal relationship between unobservables is never logically justified.

http://imgs.xkcd.com/comics/correlation.png

• We can only establish causality with experimentation.
• Often we want or need statistical evidence of a cause and effect relationship, and sometimes we do not really care about it or need it.
  • Example 1:  A person who takes medication A will have improved symptoms over a person who does not take medication A
  • Example 2:  More time partying is negatively associated with higher grades.
  • Example 3:  People who own cats are more likely to drink tea than people who own dogs
• It is much harder to establish causality than to run a simple statistical test. So, if we do not need to establish a causal relationship, we probably do not want to spend the extra time and resources (although sometimes it does not take much more time, effort or money).

Establishing Causation - refers to the condition of cause and effect

  • Association does not demonstrate causation. Just because a statistical test shows that B changes when A changes, that does not mean that the change in A caused the change in B.
  • We may not know whether an unobserved or unmeasured variable also changed.
  • We may not know whether the association is "spurious".

The statistical method we use does not determine cause and effect. The design of our data collection is much more important.

Conditions

  • Statistical evidence of association or difference is necessary, but insufficient, to determine cause and effect.
  • Temporal ordering of variables is necessary.
  • Control of other causal factors is necessary.
• All three conditions must be met in order to establish causality.
       

Statistical Test


A necessary, but insufficient, condition for establishing causality is to have statistical evidence of an association or difference.

It is impossible to establish a cause and effect relationship without a statistical test showing some evidence of an effect.

Many different tests can be used to satisfy this condition.

  • Dependency tests of association (such as regression or ANOVA)
  • Tests of association (Chi-Squared, Spearman or Pearson)
  • Tests of differences (Test of two means, Test of two proportions)
• Examples
  • Example 1:  A person who takes medication A will have improved symptoms over a person who does not take medication A
  • Example 2:  More time partying is negatively associated with higher grades.
  • Example 3:  People who own cats are more likely to drink tea than people who own dogs
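
As an illustration of a test of differences, here is a minimal sketch of a two-sided test of two proportions applied to Example 1. The counts are hypothetical, made up for this sketch, and the function name `two_proportion_z_test` is an assumption rather than something from these notes. A small p-value would satisfy only the first condition; it says nothing about temporal ordering or competing hypotheses.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 60 of 100 people improved with Medication A,
# 45 of 100 people improved without it.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```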

       

Temporal Ordering


For A to have caused a change in B, A has to come before B.

• If we are not sure what came first, we cannot establish a cause and effect relationship.
• Ensuring temporal ordering must be part of the data collection design (experimental design).
• In the Dilbert cartoon above, the incorrect conclusion could have been avoided if the team had been aware of which came first - illness or purchase of products.
• Examples - can we establish a cause and effect relationship?
• While the data that we discussed for each of the examples was likely to produce statistically significant tests of associations/differences, we have not yet established the conditions to declare cause and effect.
  • Example 1:  A person who takes medication A will have improved symptoms over a person who does not take medication A
 

In this example, we had data that would indicate that the people who took the Rx had improved symptoms over the people who did not.  In tests of medical effectiveness, we can ensure that we give people the medicine before we measure the improvement in health.  We can be certain that the medicine came before the improvement in symptoms.

  • Example 2:  More time partying is negatively associated with higher grades.

In this example, we had data that would indicate a negative association between the amount of time partying and the grades earned. Depending on how we collect the data, we can ensure that we record the hours spent partying before we give the graded assignment. We can be certain that the partying came before the lower grades - if we collect the data correctly.

  • Example 3:  People who own cats are more likely to drink tea than people who own dogs

In this example, we had data that would indicate that the people who owned cats were more likely to drink tea.  We have no way of knowing whether the person drank tea before (s)he owned a cat or owned the cat before starting to drink tea. While we may have evidence of an association, we cannot be certain of temporal ordering.
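
One way to make the temporal ordering check concrete is to record when the proposed cause and the proposed effect occurred and to classify each respondent. The sketch below assumes a hypothetical survey for Example 3 that asked for both dates; the field names `got_cat` and `started_tea`, the records, and the function `temporal_order` are all invented for illustration.

```python
from datetime import date


def temporal_order(cause_date, effect_date):
    """Classify whether the proposed cause is known to precede the effect."""
    if cause_date is None or effect_date is None:
        return "unknown"
    if cause_date < effect_date:
        return "cause first"
    return "not cause first"


# Hypothetical survey records; a missing date means the respondent
# could not say when the event happened.
records = [
    {"id": 1, "got_cat": date(2015, 3, 1), "started_tea": date(2018, 6, 1)},
    {"id": 2, "got_cat": date(2020, 1, 15), "started_tea": date(2010, 9, 1)},
    {"id": 3, "got_cat": date(2012, 5, 20), "started_tea": None},
]

labels = [temporal_order(r["got_cat"], r["started_tea"]) for r in records]
# The ordering condition holds only if the cause came first for everyone.
ordering_established = all(label == "cause first" for label in labels)

for record, label in zip(records, labels):
    print(record["id"], label)
print("temporal ordering established:", ordering_established)
```

A check like this can only be run if the dates were collected in the first place, which is why ordering has to be planned into the data collection design.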
 

       

Competing Hypothesis


A Competing Hypothesis is an alternative reason for what is being observed. 


Why else might this event have happened?

  • Changes in an unobserved or unmeasured variable might be the cause, or there may be no cause at all.  Just because it looks like a cause does not mean it is!

  All of these bulbs have similar tops - it is easy to make a mistake.

  Many times an unobserved or unmeasured variable causes both observed variables to change, and the two observed variables have no causal effect on each other.
  • Spurious association - an observed correlation without any true relationship (e.g. shorter skirts do not make the stock market rise)

(Source: hipstermusings)
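
The common-cause situation described above can be sketched with a small simulation. Everything here is an assumption made for illustration: the variable names, the sample size, and the choice of normally distributed noise. An unmeasured variable `hidden` drives both `a` and `b`, and neither of them has any effect on the other.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# The unmeasured variable that influences both observed variables.
hidden = [rng.gauss(0, 1) for _ in range(n)]
# a and b each depend on hidden plus their own independent noise.
a = [h + rng.gauss(0, 1) for h in hidden]
b = [h + rng.gauss(0, 1) for h in hidden]

r_ab = pearson_r(a, b)

# Only possible in a simulation: subtract the hidden variable's
# contribution and correlate what is left.
a_rest = [ai - h for ai, h in zip(a, hidden)]
b_rest = [bi - h for bi, h in zip(b, hidden)]
r_rest = pearson_r(a_rest, b_rest)

print(f"correlation of a and b: {r_ab:.2f}")
print(f"correlation after removing hidden: {r_rest:.2f}")
```

In this setup a and b are clearly correlated, yet the association disappears once the hidden variable's contribution is removed. With real data the hidden variable is, by definition, not available to subtract.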


Randomly assigning respondents to treatment and control groups, together with careful experimental design, allows us to control for competing hypotheses.

• It is frequently difficult to be sure that all possible competing hypotheses are effectively controlled.
• Examples - can we establish a cause and effect relationship?
• While the data that we discussed for each of the examples was likely to produce statistically significant tests of associations/differences, and for two of the examples we could be certain of temporal ordering, we have not yet established the conditions to declare cause and effect.
  • Example 1:  A person who takes medication A will have improved symptoms over a person who does not take medication A
 

In this example, we had data that would indicate that the people who took the Rx had improved symptoms over the people who did not. We could control for temporal ordering - we knew that the person either did or did not take the Rx prior to the check for improved symptoms.

Can we control for competing hypotheses?  If we randomly assign respondents to a group that receives Medication A and to a group that receives traditional treatment, we can assume away competing hypotheses because we would expect all other possible causes (e.g. some people heal faster, some people have natural immunity, some people are healthier otherwise, etc.) to be equally present in both groups. A spurious relationship can also be assumed away by the randomization. Because we have a statistical test of a difference, controlled for temporal ordering and controlled for competing hypotheses, we can claim a cause and effect.
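
The balancing argument for Example 1 can be illustrated with a simulation. This is a sketch under stated assumptions: the hidden trait `baseline` (standing in for "some people are healthier otherwise"), the assumed `TRUE_EFFECT` of 5 points, the sample size, and the noise levels are all invented for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 2000

# Hypothetical unmeasured trait, e.g. how healthy each person is otherwise.
baseline = [rng.gauss(50, 10) for _ in range(n)]

# Random assignment: shuffle the respondents and treat the first half.
indices = list(range(n))
rng.shuffle(indices)
treated = set(indices[: n // 2])

TRUE_EFFECT = 5.0  # assumed benefit of Medication A in this simulation

# Outcome = hidden trait + treatment effect (if treated) + noise.
outcome = [
    baseline[i] + (TRUE_EFFECT if i in treated else 0.0) + rng.gauss(0, 5)
    for i in range(n)
]


def group_mean(values, in_treatment):
    """Mean of values for the treatment group (True) or control group (False)."""
    return statistics.fmean(
        values[i] for i in range(n) if (i in treated) == in_treatment
    )


# Randomization should leave the hidden trait nearly equal across groups.
baseline_gap = group_mean(baseline, True) - group_mean(baseline, False)
# So the difference in outcomes should be close to the assumed effect.
estimated_effect = group_mean(outcome, True) - group_mean(outcome, False)

print(f"gap in hidden trait between groups: {baseline_gap:.2f}")
print(f"estimated treatment effect: {estimated_effect:.2f}")
```

Because assignment ignores the hidden trait, the two groups end up with nearly the same average baseline, and the difference in outcomes lands near the assumed effect even though the trait itself was never measured or adjusted for.
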
  • Example 2:  More time partying is negatively associated with higher grades.

In this example, we had data that would indicate a negative association between the amount of time partying and the grades earned, and we could control for temporal ordering.  Can we control for competing hypotheses?  Typically, we cannot assign respondents randomly to party to varying degrees before an exam.

If we cannot randomly assign respondents to the various levels of partying, we cannot assume away competing hypotheses.  There are other possible causes for any statistical relationship that we might observe:
  • People who party too much might also have a lower IQ, so IQ might be the cause of both partying and grades.
  • People who party too much might also live off campus, so living off campus might be the cause of both partying and grades.
  • Grades and partying might be a spurious relationship.

So even though we have a statistical test of a difference, and we controlled for temporal ordering, we could not control for competing hypotheses, so we cannot claim a cause and effect relationship.

       


Copyright Dr. Nancy D. Albers-Miller, All Rights Reserved