Part 1: Why Association is Not Causation
Introduction to Causal Inference
This section explores the fundamental motivation for Mendelian Randomization (MR): the critical difference between association and causation. In observational studies, an observed correlation between an exposure and an outcome does not necessarily mean the exposure causes the outcome. This is because such associations can be influenced by several types of bias.
Common Biases in Observational Studies
- Confounding: A third factor, a confounder, may be associated with both the exposure and the outcome, creating a spurious link between them (see the simulation sketch after this list).
- Reverse Causation: The direction of causality may be opposite to what is assumed; the outcome may actually be causing the exposure.
- Measurement Error: Imprecise measurement of exposures can distort associations, often weakening them and making true effects harder to detect.
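To make the confounding problem concrete, here is a minimal simulation sketch (illustrative only; the variables and coefficients are hypothetical). The exposure has no causal effect on the outcome, yet a shared confounder produces a clear crude correlation, which disappears once the confounder is accounted for.

```python
# Illustrative simulation: a shared confounder induces an exposure-outcome
# correlation even though the exposure has no causal effect on the outcome.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

confounder = rng.normal(size=n)                   # e.g. an unmeasured lifestyle factor
exposure = 0.5 * confounder + rng.normal(size=n)  # exposure driven partly by the confounder
outcome = 0.5 * confounder + rng.normal(size=n)   # outcome driven by the confounder only

# Crude association: looks like an effect of exposure on outcome (correlation ~0.2).
print(np.corrcoef(exposure, outcome)[0, 1])

# Adjusting for the confounder (regressing it out of both variables) removes the association.
resid_x = exposure - np.polyfit(confounder, exposure, 1)[0] * confounder
resid_y = outcome - np.polyfit(confounder, outcome, 1)[0] * confounder
print(np.corrcoef(resid_x, resid_y)[0, 1])        # ~0
```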
The Gold Standard and Its Limitations
Randomized Controlled Trials (RCTs) are the gold standard for establishing causality, as random allocation of the exposure minimizes bias. However, RCTs are not always practical due to high costs, long durations, ethical constraints, or issues with generalizability.
Mendelian Randomization: Nature’s RCT
Mendelian Randomization (MR) offers a way to strengthen causal inference from observational data. It uses genetic variants that are robustly associated with an exposure as “instrumental variables.” Since these genetic variants are randomly allocated at conception, MR can be thought of as a natural randomized trial.
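As a rough sketch of how this works in practice (simulated data with hypothetical effect sizes): the simplest MR estimator, the Wald ratio, divides the variant-outcome association by the variant-exposure association. Under the assumptions listed below, it recovers the causal effect even when the observational estimate is distorted by confounding.

```python
# Simulated sketch of the MR logic: a genetic variant (g) shifts the exposure,
# and the Wald ratio (g-outcome slope / g-exposure slope) recovers the causal
# effect despite an unmeasured confounder biasing the observational estimate.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

g = rng.binomial(2, 0.3, size=n)                         # genotype: 0/1/2 effect alleles
u = rng.normal(size=n)                                   # unmeasured confounder
exposure = 0.3 * g + 0.5 * u + rng.normal(size=n)
outcome = 0.4 * exposure + 0.5 * u + rng.normal(size=n)  # true causal effect = 0.4

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

print(slope(exposure, outcome))                # observational estimate: inflated (~0.6)
print(slope(g, outcome) / slope(g, exposure))  # Wald ratio: close to the true 0.4
```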
The Three Core Assumptions of MR
For an MR study to be valid, three core assumptions must hold:
- Relevance: The genetic variant must be strongly associated with the exposure of interest (see the F-statistic sketch after this list).
- Independence: The genetic variant must not be associated with any confounders of the exposure-outcome relationship.
- Exclusion Restriction: The genetic variant must affect the outcome only through the exposure, with no alternative causal pathways (violations of this assumption are known as horizontal pleiotropy).
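As one example of how these assumptions are examined in practice, the relevance assumption is commonly assessed with the first-stage F-statistic from a regression of the exposure on the variant, with F > 10 a widely used rule of thumb for avoiding weak-instrument bias. Below is a minimal sketch on simulated data (hypothetical effect size).

```python
# Sketch: checking the relevance assumption with the first-stage F-statistic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 50_000
g = rng.binomial(2, 0.3, size=n)         # genotype (hypothetical variant)
exposure = 0.3 * g + rng.normal(size=n)  # the variant explains a small share of the exposure

first_stage = sm.OLS(exposure, sm.add_constant(g)).fit()
print(f"R2 = {first_stage.rsquared:.4f}, F = {first_stage.fvalue:.1f}")
# Rule of thumb: F > 10 suggests the instrument is unlikely to be badly weak.
```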
Together, these assumptions allow causal effects to be estimated from observational data with far greater robustness to confounding and reverse causation. For a foundational understanding, we recommend the seminal paper by Davey Smith and Ebrahim (2003), and for practical implementation, the guidelines by Burgess et al. (2020).