Automated STR data analysis: Validation studies

M.W. Perlin, D. Coffman, C.A. Crouse, F. Konotop and J.D. Ban, "Automated STR data analysis: Validation studies", Promega's Twelfth International Symposium on Human Identification, Biloxi, MS, 11-Oct-2001.

Downloads

PowerPoint presentation and handout for the International Symposium on Human Identification 2001 talk.

Download Handout
Download PowerPoint
Download Conference Paper

Abstract

STR technology has enabled the rapid generation of highly informative DNA data for use in human identification. However, these data must be carefully analyzed. With database samples, there is now an acute shortage of skilled data reviewers. With casework samples (including mixtures), much information is not extracted from the data, despite considerable examiner effort. We are rapidly developing novel computational, mathematical and statistical methods that help overcome these limitations. This report focuses on the collaborative validation of these methods.

Convicted offender DNA databases must be accurate. To minimize error, the original STR data are carefully reviewed by two or more people. Moreover, in a troubleshooting capacity, this review helps to continuously maintain high quality lab data. But there are not enough skilled personnel for this arduous, repetitive task. To alleviate this critical labor shortage, we developed the TrueAllele™ expert system. The computer program automates virtually every human review function, and provides consistent quality assessment and allele designation.

The TrueAllele validation starts with the original data from 30,000 CODIS samples. System parameters are adapted to the instruments (ABI/310, ABI/3700, Hitachi/FMbio) and panels (ProfilerPlus, Cofiler, Powerplex 1.1 and 2.1) used to generate the data. Computer processing is then done, with automated scoring of the high quality data, followed by limited human review. The computed expert system results are compared against previously manually scored results. We will report on the relative accuracy and efficiency of the automated approach.
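The comparison step described above can be illustrated with a small sketch. This is not the TrueAllele software or the authors' validation protocol; it assumes, purely for illustration, that each allele call is stored per (sample, locus) pair as an unordered set of allele labels. The names `genotype` and `compare_calls` and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Key = Tuple[str, str]        # (sample id, locus name); hypothetical layout
Genotype = FrozenSet[str]    # unordered set of allele labels


def genotype(*alleles: str) -> Genotype:
    """Build an order-independent genotype; a homozygote has one label."""
    return frozenset(alleles)


def compare_calls(
    auto: Dict[Key, Genotype], manual: Dict[Key, Genotype]
) -> Tuple[float, List[Key]]:
    """Return the concordance rate and the discordant (sample, locus) keys.

    Only keys scored by both methods are compared.
    """
    shared = sorted(auto.keys() & manual.keys())
    discordant = [k for k in shared if auto[k] != manual[k]]
    rate = 1.0 - len(discordant) / len(shared) if shared else float("nan")
    return rate, discordant


# Made-up example calls (not data from the study).
auto = {
    ("S1", "D3S1358"): genotype("15", "16"),
    ("S1", "vWA"): genotype("17"),
    ("S2", "D3S1358"): genotype("14", "15"),
}
manual = {
    ("S1", "D3S1358"): genotype("16", "15"),  # same genotype, different order
    ("S1", "vWA"): genotype("17", "18"),      # disagreement to be reviewed
    ("S2", "D3S1358"): genotype("14", "15"),
}

rate, discordant = compare_calls(auto, manual)
print(f"concordance: {rate:.3f}; discordant: {discordant}")
```

In a sketch like this, the discordant keys would be the ones routed to a human reviewer, since a disagreement alone does not show which of the two calls is correct.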

In casework, DNA mixtures are analyzed to assess candidate suspects. When inferred profiles are matched against a convicted offender database, useful leads are generated. When matched against a known suspect, the mixture data can help convict or exonerate. However, data uncertainty leads to inherently complex and ambiguous analysis. We have developed a new technology, Linear Mixture Analysis (LMA), which uses multilocus quantitative data to automatically eliminate this complexity. LMA objectively resolves mixtures into candidate profiles, and provides highly informative statistical measures.

The LMA validation involves both synthetic mixtures and actual casework profiles derived from diverse panels and instruments. After quantitative peak analysis (using TrueAllele) on the original data, we apply LMA to automatically determine contributor profiles. Database search validation is done by assessing the error rates of matching these profiles against existing DNA databases. Casework validation is done by examining the LMA statistics relative to known suspect profiles. We will report on LMA's accuracy and informativeness.
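The abstract does not give LMA's mathematics, so the following is only a minimal sketch of how a linear mixture model might resolve a two-person mixture. It is not the authors' implementation. It assumes that the normalized peak quantities at each locus are approximately a weighted sum of the contributors' genotype vectors, that one contributor's genotype is known, and that one mixing weight is shared across loci. The function names, the grid search, and the noise-free example data are all hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # indices of the two alleles carried at a locus


def genotype_vector(pair: Pair, n: int) -> List[float]:
    """Allele fractions for a genotype: 0.5 per allele copy, summing to 1."""
    v = [0.0] * n
    for a in pair:
        v[a] += 0.5
    return v


def best_unknown(d: Sequence[float], known: Sequence[float], w: float):
    """For weight w on the known contributor, find the unknown genotype
    minimizing the squared error of d ~ w*known + (1-w)*unknown."""
    n = len(d)
    best = None
    for pair in combinations_with_replacement(range(n), 2):
        u = genotype_vector(pair, n)
        err = sum((d[i] - w * known[i] - (1 - w) * u[i]) ** 2 for i in range(n))
        if best is None or err < best[0]:
            best = (err, pair)
    return best


def resolve(loci: Sequence[Tuple[Sequence[float], Pair]], steps: int = 100):
    """Grid-search one mixing weight shared by all loci, summing the
    per-locus errors so that every locus informs the weight."""
    best = None
    for s in range(1, steps):
        w = s / steps
        total, calls = 0.0, []
        for d, known_pair in loci:
            err, pair = best_unknown(d, genotype_vector(known_pair, len(d)), w)
            total += err
            calls.append(pair)
        if best is None or total < best[0]:
            best = (total, w, calls)
    return best[1], best[2]


# Idealized, noise-free synthetic data built with a 70:30 mixing ratio.
loci = [
    ([0.35, 0.35, 0.15, 0.15], (0, 1)),  # unknown carries alleles 2 and 3
    ([0.70, 0.15, 0.15], (0, 0)),        # unknown carries alleles 1 and 2
    ([0.35, 0.65], (0, 1)),              # unknown is homozygous for allele 1
]
w, calls = resolve(loci)
print(f"estimated weight of known contributor: {w:.2f}; unknown genotypes: {calls}")
```

Real peak data are noisy, so a practical method would also need to quantify how well competing genotypes fit rather than report only the single best one; the sketch omits that statistical step.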

Our presentation describes novel computer-based methods for assuring data quality, automating DNA database review, and analyzing the mixed DNA profiles found in casework. We will present the objective results of our ongoing validation studies, and demonstrate the feasibility of practical automated analysis. Our primary objective is the rapid introduction of validated intelligent data analysis systems for eliminating tedious human STR analysis. This contribution may help free up valuable DNA examiner time for serving justice through forensic science.
