Explaining the Likelihood Ratio in DNA Mixture Interpretation
Perlin, M.W. Explaining the likelihood ratio in DNA mixture interpretation. In: Proceedings of Promega's Twenty First International Symposium on Human Identification. San Antonio, TX, 2010.
Abstract: In DNA identification science, the likelihood ratio (LR) assesses the evidential support for the identification hypothesis that a suspect contributed their DNA to the biological evidence. The LR summarizes the sensitivity and specificity of a statistical test. The LR logarithm is a standard information measure for stating the support for a simple hypothesis (i.e., a single assertion relative to its logical alternative).
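As a minimal sketch of these two points, the LR of a binary test can be computed from its sensitivity and specificity, and its base-10 logarithm can be read as a weight of evidence. The function names and numeric values below are illustrative assumptions, not figures from the paper.

```python
from math import log10


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR for a positive result: P(positive | H) / P(positive | not H)."""
    false_positive_rate = 1.0 - specificity
    if false_positive_rate <= 0.0:
        raise ValueError("specificity must be below 1 for a finite LR")
    return sensitivity / false_positive_rate


def log_lr(lr: float) -> float:
    """Base-10 logarithm of the LR (a unit sometimes called the 'ban')."""
    return log10(lr)


# Hypothetical test: it includes a true contributor 99% of the time and
# falsely includes a non-contributor about one time in a million.
example_lr = positive_likelihood_ratio(sensitivity=0.99, specificity=0.999999)
example_log_lr = log_lr(example_lr)
print(f"LR = {example_lr:.3g}, log10(LR) = {example_log_lr:.2f}")
```

On the logarithmic scale, independent pieces of evidence add rather than multiply, which is one reason the log LR is convenient as an information measure.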
After Alan Turing's LR methods cracked the German Enigma code during World War II, LR usage became widespread. The LR is ubiquitous in the physical, biological, social, economic, computer and forensic sciences. First introduced into biological identification through paternity testing, the LR enjoys unparalleled international usage as the most informative DNA mixture statistic.
Yet American crime labs avoid the LR, and prefer to report DNA inclusion statistics that they find easier to explain in court. Such "inclusion" methods (variously termed PI, CPI, CPE or RMNE) use less of the DNA data, typically discarding a million-fold factor of identification information. Thus highly informative DNA mixture evidence can be reported as "inconclusive" or assigned an unrealistically low match score. Unfortunately, minimizing DNA evidence leads to a failure to identify criminals, with an adverse effect on public safety.
To make the LR more acceptable to American analysts and their juries, we need more intuitive ways to explain the LR. Fortunately, the LR can be expressed (by Bayes' theorem) in several equivalent ways. Stated in plain English, these alternative formulations include:
- the information gain in the identification hypothesis from the DNA data,
- how well the identification hypothesis explains the data, relative to its alternative, and
- our increased belief in a match to a suspect, based on the inferred evidence genotype.
The second LR formulation prevails in forensic DNA. While this formulation is natural for computers and statisticians, non-mathematicians often find its formulas opaque. In this paper, we describe the other two formulations as intuitive ways to explain the LR simply and accurately. Moreover, these other approaches avoid the dread "transposed conditional." Using DNA case examples, we show how to easily understand the LR, present it in court, and deflect superficial challenges.
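The equivalence of the three formulations listed above can be checked numerically. The following is a sketch under stated assumptions: a single locus with three possible genotypes, made-up population frequencies, and made-up probabilities of the data given each genotype. None of these numbers or names come from the paper's case examples.

```python
from math import isclose


def lr_three_ways(genotype_prior, data_likelihood, suspect, prior_h=0.5):
    """Return the LR computed as (odds gain, likelihood ratio, genotype gain).

    genotype_prior:  P(genotype) in the population (hypothetical values)
    data_likelihood: P(data | contributor has that genotype) (hypothetical)
    suspect:         the suspect's genotype
    prior_h:         prior probability that the suspect contributed
    """
    # P(data | not H): the contributor is a random member of the population.
    p_data_random = sum(
        data_likelihood[g] * genotype_prior[g] for g in genotype_prior
    )
    p_data_suspect = data_likelihood[suspect]  # P(data | H)

    # Formulation 1: information gain, posterior odds over prior odds.
    prior_odds = prior_h / (1.0 - prior_h)
    posterior_h = (p_data_suspect * prior_h) / (
        p_data_suspect * prior_h + p_data_random * (1.0 - prior_h)
    )
    lr_odds = (posterior_h / (1.0 - posterior_h)) / prior_odds

    # Formulation 2: how well H explains the data relative to its alternative.
    lr_likelihood = p_data_suspect / p_data_random

    # Formulation 3: inferred genotype probability at the suspect's genotype,
    # relative to that genotype's population probability.
    posterior_genotype = p_data_suspect * genotype_prior[suspect] / p_data_random
    lr_genotype = posterior_genotype / genotype_prior[suspect]

    return lr_odds, lr_likelihood, lr_genotype


# Toy single-locus example (all values are illustrative assumptions).
genotype_prior = {"AA": 0.01, "AB": 0.18, "BB": 0.81}
data_likelihood = {"AA": 0.02, "AB": 0.90, "BB": 0.05}

lr_odds, lr_likelihood, lr_genotype = lr_three_ways(
    genotype_prior, data_likelihood, suspect="AB"
)
print(lr_odds, lr_likelihood, lr_genotype)
```

With these toy numbers the three values agree up to floating-point rounding, and changing `prior_h` leaves the result unchanged. The first and third formulations are stated as probabilities of the hypothesis or genotype given the data, so explaining them does not require reasoning about the probability of the data given the hypothesis.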
For the American public to benefit from the full protective power of DNA identification information, analysts must be able to confidently explain the LR. This paper shows them how.