Simple Reporting of Complex DNA Evidence: Automated Computer Interpretation

Perlin, M.W. Simple reporting of complex DNA evidence: automated computer interpretation. In: Proceedings of Promega's Fourteenth International Symposium on Human Identification. Phoenix, AZ, 2003.


Abstract

Complex DNA evidence (e.g., mixed, degraded, or small amounts of DNA) can confound straightforward approaches to interpretation and reporting. Reporting simplicity here has four components: (1) rapid turnaround time to the police or prosecutors who need DNA answers, (2) high information content that provides a useful discriminating power (DP), (3) understandable results that can be presented to the layperson, and (4) admissibility in court that reflects the underlying reliability of the scientific methods employed.

We have developed a fully automated expert computer system that interprets complex DNA evidence based on mathematical models of the short tandem repeat (STR) process and a modern statistical assessment of certainty. After taking several minutes to consider thousands of numerical variables, the computer can present the results of its rigorous deliberations as ordinary probabilities that are understandable to juries. This paper describes the collaborative study design, data generation, analytic interpretation, and scientific results obtained in validating our expert system.

The validation study was designed collaboratively with crime labs in Florida, Maryland, and Virginia. The "mock rape kit" approach analyzed a set of two-contributor "sperm fraction" DNA mixtures (10%, 30%, 50%, 70% and 90% proportions; at 1.0, 0.5, 0.25 and 0.125 ng dilutions), along with their "victim" reference samples. NIST prepared the mixed DNA stock solutions in known proportions for two individual pairs, and sent these materials to the ten participating forensic laboratories. Each lab followed detailed study protocols, generated PCR data for 56 templates, and sent its original sequencer data to Cybergenetics for automated computer interpretation. Each lab also provided about 100 single-source samples that were used for calibrating its PCR process and artifacts (stutter, preferential amplification, peak variation, etc.). All STR chemistries and DNA sequencers in current forensic use were represented.
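The mixture grid described above can be enumerated in a few lines. This is only an illustrative sketch: the proportions and template amounts come from the abstract, while the pair labels and field names are hypothetical, and the abstract does not say which contributor each proportion refers to.

```python
from itertools import product

# Values stated in the abstract.
PROPORTIONS = [0.10, 0.30, 0.50, 0.70, 0.90]  # mixture proportions
TEMPLATE_NG = [1.0, 0.5, 0.25, 0.125]         # template amounts in ng

# Hypothetical labels for the two individual pairs prepared by NIST.
PAIRS = ["pair_A", "pair_B"]


def mixture_design():
    """Return one record per mixture sample in the proportion x dilution x pair grid."""
    return [
        {"pair": pair, "proportion": proportion, "template_ng": ng}
        for pair, proportion, ng in product(PAIRS, PROPORTIONS, TEMPLATE_NG)
    ]


design = mixture_design()
print(len(design))  # 40
```

The 40 grid points agree with the 40 unknown suspect cases analyzed per laboratory; the abstract does not itemize the remaining templates that make up the 56 amplified per lab.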

Cybergenetics ran its sequencer-independent TrueAllele® program on the original lab data to generate a database of quality-checked, quantitated peaks (under five minutes of human time per gel). The interpretation expert system was then applied to these data (no human time). The calibration produced graphs of stutter and preferential amplification for each marker. The automated mixture analysis for each lab's 40 unknown suspect cases (mixture and reference data, but no suspect data) yielded an estimate of the mixing proportion, the genotype confidence set at each locus, and quantitative bar graphs that permitted rapid visual comparison between the observed data and the best genotype model. These computer results were reviewed by each laboratory.
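To make the idea of estimating a mixing proportion and an unknown genotype from quantitated peaks concrete, here is a deliberately simplified single-locus sketch. It is not the TrueAllele algorithm, whose details the abstract does not give. It assumes a plain linear mixture model fitted by least squares, with no stutter, preferential amplification, peak variance, or allele drop-out, and it only considers candidate alleles that appear in the data. All function names are hypothetical.

```python
from itertools import combinations_with_replacement


def genotype_vector(genotype, alleles):
    """Allele-count vector of a two-allele genotype over the locus allele list."""
    return [sum(1 for a in genotype if a == allele) for allele in alleles]


def fit_weight(data, known, candidate):
    """Least-squares weight w in [0, 1] for data ~ w*candidate + (1 - w)*known."""
    diff = [c - k for c, k in zip(candidate, known)]
    resid0 = [d - k for d, k in zip(data, known)]
    denom = sum(x * x for x in diff)
    # If the candidate equals the known genotype, w is not identifiable; use 0.
    w = 0.0 if denom == 0 else sum(r * x for r, x in zip(resid0, diff)) / denom
    w = min(1.0, max(0.0, w))
    error = sum((r - w * x) ** 2 for r, x in zip(resid0, diff))
    return w, error


def infer_unknown(peaks, known_genotype):
    """Return (genotype, weight, squared_error) of the best-fitting unknown contributor.

    peaks maps allele -> quantitated peak height at one locus.
    known_genotype is the reference (e.g. "victim") genotype at that locus.
    """
    alleles = sorted(peaks)
    total = sum(peaks.values())
    # Rescale peak heights so they sum to 2, matching two allele copies per genotype.
    data = [2 * peaks[a] / total for a in alleles]
    known = genotype_vector(known_genotype, alleles)
    best = None
    for candidate in combinations_with_replacement(alleles, 2):
        w, error = fit_weight(data, known, genotype_vector(candidate, alleles))
        if best is None or error < best[2]:
            best = (candidate, w, error)
    return best


# Toy data: known contributor (12, 13) at 70%, unknown (14, 15) at 30%.
genotype, weight, error = infer_unknown({12: 700, 13: 700, 14: 300, 15: 300}, (12, 13))
print(genotype, round(weight, 3))
```

A production system would pool evidence across loci and report a set of plausible genotypes with associated certainty rather than a single best fit, which is what the genotype confidence set in the text refers to.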

The DNA templates and the computer interpretation were the same throughout the study, so we could compare the amount of information present in each laboratory's data. Our key information measure was discriminating power: the probability that a randomly selected person from the population matches the inferred unknown suspect profile. Unlike conventional measures (e.g., combined probability of exclusion (CPE), likelihood ratios), our computed DP is an understandable probability result that reflects the computer's detailed consideration of all feasible genotypes, and captures all relevant information. We (1) found that DP is an accurate measure of laboratory data quality, (2) established a quantitative relationship between mixture proportion and template dilution relative to information content, and (3) showed how the computer could combine multiple mixture samples at low DNA concentration without any reference sample to derive very high unknown suspect DP.
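The match probability described above can be illustrated with a standard population-genetics calculation. This is a sketch under stated assumptions, not the paper's exact DP computation: it assumes Hardy-Weinberg proportions within a locus, independence across loci, and an unweighted genotype confidence set of distinct genotypes. The locus names and allele frequencies are invented for illustration and are not real population data.

```python
from math import prod


def genotype_frequency(genotype, allele_freq):
    """Hardy-Weinberg genotype frequency: p^2 for a homozygote, 2pq for a heterozygote."""
    a, b = genotype
    p, q = allele_freq[a], allele_freq[b]
    return p * p if a == b else 2 * p * q


def locus_match_probability(confidence_set, allele_freq):
    """Chance that a random person's genotype falls in the locus confidence set."""
    return sum(genotype_frequency(g, allele_freq) for g in confidence_set)


def profile_match_probability(confidence_sets, allele_freqs):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(
        locus_match_probability(confidence_sets[locus], allele_freqs[locus])
        for locus in confidence_sets
    )


# Invented allele frequencies and confidence sets for two hypothetical loci.
allele_freqs = {
    "locus_1": {14: 0.1, 15: 0.2, 16: 0.7},
    "locus_2": {8: 0.5, 9: 0.5},
}
confidence_sets = {
    "locus_1": [(14, 15)],        # one genotype inferred with confidence
    "locus_2": [(8, 8), (8, 9)],  # two genotypes remain feasible
}

# Approximately 0.03: locus_1 contributes 0.04 and locus_2 contributes 0.75.
print(profile_match_probability(confidence_sets, allele_freqs))
```

A smaller confidence set at a locus gives a smaller match probability, which is why cleaner laboratory data yields a more discriminating result.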

The legal admissibility of our scientific approach was established by the reliability demonstrated in our collaborative validation study. All points of Daubert were addressed: testability, error rate, peer review, and general acceptance. These admissibility issues are detailed in this paper. We have validated an expert computer approach to the interpretation of complex DNA evidence that is objective, unbiased, reproducible, reliable and admissible. Moreover, the system generates simple and understandable results that are useful in court.