UNITED STATES of America
v.
Paul E. LOWE, Defendant.
Criminal No. 95-10404-PBS.
United States District Court, D. Massachusetts.
December 6, 1996. As Corrected January 16, 1997.
*402 Peter Parker, Federal Defender's Office, Boston, MA, for defendant Paul E. Lowe.
Paula J. DeGiacomo, Despena F. Billings, United States Attorney's Office, Boston, MA, for U.S.
MEMORANDUM AND ORDER
SARIS, District Judge.
I. Introduction
This memorandum addresses a challenge to the admissibility of DNA profiling evidence in a criminal trial. On November 8, 1996, Defendant Paul Lowe was convicted of carjacking, in violation of 18 U.S.C. § 2119, kidnapping, in violation of 18 U.S.C. § 1201(a), and forcible transportation of another for criminal sexual activity, in violation of 18 U.S.C. § 2422(a). Prior to trial, pursuant to Fed.R.Evid. 702, 901 and 403, defendant Lowe had filed a motion to exclude evidence that his DNA profile matches the DNA samples in the rape kit of the alleged victim, her clothing, and in her car. Lowe raised essentially three challenges to the government's proffered evidence.
First, Lowe claimed that a new protocol employed by the Federal Bureau of Investigation ("FBI") in generating DNA profiles from forensic samples in this case via a method known as Restriction Fragment Length Polymorphism ("RFLP") analysis is not sufficiently reliable to be admissible. Chemiluminescence along with three other changes to the FBI's RFLP protocol were introduced in October of 1995, and were officially adopted in June of 1996. The other changes challenged here are (1) the elimination of ethidium bromide, (2) use of longer gels, and (3) use of new sizing ladders. Specifically, he alleged that the FBI's adoption of chemiluminescent probes and these other changes in the RFLP typing procedure has yet to undergo adequate scientific validation, rigorous peer review, or acceptance by the relevant scientific community. He also challenged the FBI's practice of comparing DNA profiles generated by the chemiluminescent protocol with databases generated under the old protocol, which used radioactive isotopes in determining population frequency.
Second, Lowe challenged the admissibility of DNA profiles obtained with a typing procedure known as polymerase chain reaction ("PCR") analysis,[1] on grounds that two of three PCR tests conducted, the Polymarker and D1S80, have not been adequately validated through scientific testing and the peer review process, and are not generally accepted.
Third, with respect to RFLP and PCR, he further claimed the reliability of the FBI's results cannot be ascertained because the FBI's DNA Unit fails to compile, or make any effort to calculate, laboratory error rates, and because of inadequate proficiency testing.
After due consideration of each of the factors flagged in Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), governing the admissibility of scientific evidence, and *403 after four days of extensive evidentiary hearings, on July 30 and September 6, 13 and 17, 1996, the Court DENIED Lowe's motion. The Court now sets forth its findings of fact and conclusions of law below.
II. Background
A. The Crime
At trial, the government introduced evidence of the following. Lowe assisted a young woman ("K.") whose car was stuck in a snowbank at the end of her driveway after a snowstorm, in Lowell, Massachusetts. He then forced her to let him in, and drove her to New Hampshire where, in the front passenger seat of the car, he forced her to perform oral sex on him and raped her vaginally. He then drove her in her car back to Lowell, Massachusetts, where he left her, after stealing her jewelry.
The primary defense was consent. Upon arrest, defendant admitted to the police to having sexual intercourse with the woman in an apartment, but insisted that she consented.[2] At trial, defendant claimed K. had a motive to fabricate that her consensual sexual conduct was a rape because of an abusive relationship with her boyfriend.
B. The Forensic Samples
K. told the government she drove to her boyfriend's house and was taken to the hospital later that day, where a rape kit was administered and blood and hair samples taken. Her clothing, an overcoat, and a towel she had used at her boyfriend's house to wipe her vaginal area were collected for forensic examination and analysis. Hair and other biological samples were also collected from both the passenger and driver sides of her automobile. In particular, a swab of human fluid was taken from inside the driver's side window. Upon Lowe's arrest, the blue jeans he was wearing were seized. By order of this Court, Lowe provided blood and hair samples to the government as well as items of his clothing. The FBI's DNA Analysis Unit subjected the forensic samples, along with the known biological samples provided by K. and Lowe, to either RFLP or PCR DNA analysis, and concluded that Lowe's DNA profile is "consistent" with certain of the forensic samples. The probability that each such match could be purely coincidental was then calculated.[3] The RFLP analysis yielded a match probability of 1 in 11 billion for the Caucasian population. The PCR analysis yielded a figure of 1 in 810,000 for the same population. Lowe does not dispute his classification as Caucasian, and does not present this Court with any evidence that he belongs to a sub-population (e.g., Swedish).
C. The Hearing
During the evidentiary hearing, the Government relied on the testimony of Alan M. Giusti, a forensic examiner at the FBI's DNA Analysis Unit, Scientific Analysis Section ("DNA Unit"), who performed the DNA analyses in this case, and Dr. Martin L. Tracey, a biologist with an expertise in DNA and population genetics. The defendant relied on testimony of Dr. Dan E. Krane, an assistant professor in the Department of Biological Sciences at Wright State University in Ohio. Cited extensively is a Report of the National Research Council, The Evaluation of Forensic DNA Evidence 0-3 (1996) (prepublication copy) [hereinafter "1996 NRC Rep."] which both the government and the defendant agree is an authoritative work in the field.
III. Scientific Background For DNA Profiling
A. A Primer
Human cells contain a nucleus. Within each nucleus there are 46 chromosomes, 23 inherited from each parent. Each chromosome is constructed of deoxyribonucleic acid, or DNA, which "contains the coded information that provides the genetic blueprint" for each individual. United States v. Jakobetz, 955 F.2d 786, 791 (2d Cir.), cert. denied, 506 U.S. 834, 113 S.Ct. 104, 121 L.Ed.2d 63 (1992). Within each human being, the "cells *404 from various tissues, such as blood, hair, skin, and semen, have the same DNA content and therefore provide the same forensic information." 1996 NRC Rep. at 0-3.[4] With the exception of identical twins, the precise configuration of DNA differs from person to person, however, resulting in the physical uniqueness of each individual and the usefulness of DNA typing for the purposes of identification. Id.
The physical structure of DNA resembles "a twisted rope ladder with stiff wooden steps." Id. at 2-2. The sides of the ladder are composed of phosphate and sugar molecules, while each rung is composed of a pair of organic compounds called bases. There are four kinds of bases: adenine ("A"), guanine ("G"), cytosine ("C") and thymine ("T"). Id. Because of their chemical composition, T always pairs with A, and G with C. Id. This strict pairing rule means that the order, or sequence, of bases on one side of the DNA ladder will determine the sequence of the other side. Id.; Jakobetz, 955 F.2d at 791. See Appendix A.
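The strict pairing rule described above can be illustrated with a minimal Python sketch. The function name and the example sequence are hypothetical and chosen only for illustration; nothing here is drawn from the record.

```python
# Base-pairing rule from the text: T always pairs with A, and G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def opposite_strand(strand):
    """Return the bases sitting across from `strand`, rung by rung."""
    return "".join(PAIRS[base] for base in strand)


example = "GGCCAT"  # hypothetical sequence, for illustration only
print(example, "->", opposite_strand(example))
```

Because each base has exactly one partner, applying the function twice returns the original sequence, which is the sense in which one side of the ladder determines the other.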
The entire order, or sequence, of base pairs observed in the DNA ladder of a particular individual is the genetic code of that individual. 1996 NRC Rep. at O-4. [Giusti Aff., ¶ 24] A gene is a portion of DNA comprised of anywhere from a few thousand to tens of thousands of base pairs, the specific sequence of which serves as an encoded formula for producing the various proteins that make up the human body, and the particular features of individual human beings, such as blood type, eye color, etc. Id.
B. Variable Number of Tandem Repeats ("VNTRs")
Other fragments of DNA have no known function, but display varying numbers of repeats of a recognizable core sequence of base pairs, known as Variable Number of Tandem Repeats ("VNTRs"). Id. at O-6. The position that a gene or other DNA fragment occupies on the DNA ladder is called its "locus." Id. at O-4. At a given locus, sequences that vary in the number of repeats from individual to individual are known as "alleles." Id. at 2-6. Certain loci are particularly useful for forensic analysis because they are highly polymorphic, that is, they have a very large number of alleles, which can be identified by both their core sequence, or distinct order of As, Ts, Cs, and Gs, and their length, or exact number of repeats of each sequence. Id. at O-6. [Giusti Aff. ¶ 27].
When several polymorphic loci are analyzed at once, the possible genetic variability becomes enormous, id., reducing the likelihood that two different DNA samples so analyzed would "match" were the donor of each not the same individual. Mr. Giusti compares VNTRs to boxcars on a train. For example, at a particular locus a sequence such as GTGAGCTT-TTAGTAAAG may repeat itself 70 times or as many as 450 times depending on the individual. Giusti Aff. ¶ 36. Because the number of VNTRs varies from individual to individual, the length of the polymorphic fragments varies commensurately. RFLP analysis measures and compares the length of fragments.
C. Polymorphisms
There are two types of polymorphisms of interest in DNA profiling: length and sequence. Length polymorphisms, or VNTRs, as mentioned, are differences in the length of a particular allele, measured in terms of the number of repeats of a simple core sequence of base pairs. Sequence polymorphisms are differences in the actual sequence of base pairs between alleles found at a particular locus. Id. ¶ 27.
The FBI conducts two different types of tests on DNA that compare biological samples from a known source and forensic samples to determine the likelihood that the two samples originated from the same source. RFLP analysis is designed to detect and measure fragment length variations at a number of different polymorphic regions on the DNA molecule. See 1996 NRC Rep. at 2-9. See also Jakobetz, 955 F.2d at 792.
The PCR procedure is used to analyze bits of DNA too small to be suitable for RFLP analysis, and is generally used to detect and *405 compare specific base-pair sequences. To do this, the PCR protocol requires that the DNA segments be replicated through a process of separation of the DNA strands, and isolation and amplification of the target sequence. See 1996 NRC Rep. at 2-11.
The D1S80 test is a hybrid of the PCR and RFLP methodologies. It detects fragment length polymorphisms in much the same way as RFLP analysis, once the target DNA fragment has been isolated and amplified through the PCR procedure.
PCR, like RFLP, cannot make an absolute identification, but instead generates either an absolute exclusion or an indication that the samples tested are "consistent" with each other, which then requires a further calculation to estimate the probability that a random match would obtain if the samples were not from the same individual. [Giusti Aff. ¶ 153-155.] Both processes are described in greater detail below.
D. RFLP Analysis
RFLP analysis is a widely-accepted and scientifically validated method of forensic DNA testing, which has never been rejected as unreliable in any state or federal court. See 1996 NRC Rep. at 6-9 ("To the best of our knowledge, no state or federal court has held that VNTR profiling [i.e., RFLP analysis] is inadmissible on the grounds that it is not scientifically accepted or sound."). A brief description of the RFLP process (also referred to as VNTR typing) follows. See also Jakobetz, 955 F.2d at 792-93; United States v. Bonds, 12 F.3d 540, 550-51 (6th Cir.1993); 1996 NRC Rep. at 2-6 to 2-11.
First, DNA is extracted from the biological source material and placed in a solution. The DNA solution is then tested to determine whether enough DNA is present in the sample for RFLP analysis to proceed. 1996 NRC Rep. at 2-7.
Second, a restriction enzyme is applied to a biological sample containing DNA which chemically cuts the DNA into small fragments for analysis. Id. The restriction enzyme is designed to recognize a particular DNA sequence and cut the strand at that point. 1996 NRC Rep. at 2-7. The FBI utilizes an enzyme known as Hae III, which recognizes the base sequence GGCC and cuts the DNA between the G and C wherever that sequence is found. Giusti Aff. ¶ 34.
Third, the resulting DNA fragments are placed in "lanes" on a chemical gel to which an electric current is applied. DNA has a net negative charge, and because opposite charges attract, a positive field is applied at one end of the gel. Giusti Aff. ¶ 38. Standard DNA fragments are also run in several lanes of the gel for the purpose of sizing the sample fragments, much like a ruler, and are typically referred to as "sizing ladders" or "markers." In this phase, called gel electrophoresis, the DNA fragments migrate through the gel at different speeds according to their size. When the current is stopped, the shorter fragments will have travelled through the gel farther than the longer fragments. At this point, the fragments are invisible. Id. See Appendix B.
Fourth, the DNA strands are chemically denatured, or "unzipped," separating the ladder of base pairs into two single strands. A process called Southern blotting is then used to transfer the single-stranded fragments from the gel to a nylon membrane, where they occupy the same position as they did on the gel. Id.
Fifth, during a process called hybridization, a single-stranded probe of manufactured DNA having a known sequence of base pairs is applied to the denatured DNA on the membrane. The probe will attach to the corresponding sequence of bases present in the denatured strand on the membrane (A to T & C to G) to recreate the double helix. The probe will attach to as many repeats of that particular sequence as are present in the samples on the membrane. (Giusti test., Tr-1, p. 25) Any remaining probe that does not attach to the target DNA sequence is chemically washed away, 1996 NRC Rep. at 2-7, so that the process can be repeated for different VNTR loci. G. Aff. ¶ 51.
Under the FBI's old protocol, the probes utilized for hybridization contained radioactive atoms. When the membranes were placed on a piece of x-ray film, emissions from the radioactive probe acted as a "flare" to expose the film at locations on the membrane where *406 the probe adhered to the target VNTR in the sample. Id. The resulting image, called an autorad, or autoradiograph, contains dark bands corresponding to the position of the fragments on the membrane, which reveals the distance they travelled through the gel, and thus their relative lengths. The bands appear in columns, with each column corresponding to certain forensic DNA samples and known samples of DNA from a particular person, such as the alleged perpetrator and/or victim, as well as control samples and "blanks" included for quality control purposes (discussed infra). Usually, there are two bands per column because there are two fragments of different length at each locus tested (i.e., one inherited from each parent). Giusti Aff. ¶ 55. See Appendix C.
The position of the bands on the membrane indicates the length of the VNTR segment relative to the bands of the sizing ladder, and can be expressed as an approximate number of base pairs. This process, known as autoradiography, is typically repeated for four to six loci or more,[5] and takes several days for each locus tested. The entire process can thus consume several weeks or months, and presents both health hazards and problems of radioactive waste disposal. Id. at 2-9, 2-11.
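The restriction-digest and electrophoresis steps described above can be sketched in a few lines of Python. This is an illustration only: it assumes a single DNA strand written as a string, uses the Hae III recognition sequence (GGCC, cut between the G and the C) stated in the text, and the input sequence and function name are hypothetical.

```python
SITE = "GGCC"   # recognition sequence of Hae III, per the text
CUT_OFFSET = 2  # the cut falls between the second G and the first C


def digest(dna):
    """Cut `dna` at every GGCC site and return the fragments in order."""
    fragments = []
    start = 0
    pos = dna.find(SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(SITE, pos + 1)
    fragments.append(dna[start:])
    return fragments


sample = "ATGGCCTTAGGCCA"  # hypothetical sequence, for illustration only
pieces = digest(sample)

# Shorter fragments travel farther through the gel, so sorting by length
# (longest first) mimics the order of bands from the starting end.
for piece in sorted(pieces, key=len, reverse=True):
    print(len(piece), piece)
```

Real VNTR fragments run to thousands of base pairs, and their lengths are estimated against the sizing ladder rather than counted exactly; the sketch shows only the cutting rule and the size ordering.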
E. Genetic Matching
The sixth stage involves comparative analysis of the known and forensic samples to determine whether or not they could have originated from the same source. This requires first that the several columns of bands be compared visually. Bands that appear at the same position along the length of the membrane as other bands pictorially represent VNTR fragments of the same length. If the bands from a forensic sample do not appear to occupy the same position as the bands from a known DNA donor, the latter may immediately be excluded as the source of the forensic sample. If they do appear visually to occupy the same position, then the known sample is consistent with the forensic sample; however, a more refined analysis is performed using a computer to determine whether there is a "match." The computer determines the approximate size of fragments producing the observed bands by comparing them against the sizing ladders. Because there is some inevitable degree of measurement imprecision that does not permit exact measurement of fragment length, the FBI has adopted a +/-2.5% window which allows a match to be declared "if the [length, or size] difference between the fragments being compared is 5% or less of the mean (or average) of the size of the two fragments being compared." [Giusti Aff. ¶ 58]. See Jakobetz, 955 F.2d at 793.
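The match criterion quoted above reduces to a one-line comparison. The following Python sketch assumes fragment sizes expressed in base pairs; the function name and the sample sizes are hypothetical and are not taken from the record.

```python
def is_match(size_a, size_b, window=0.05):
    """Declare a match if the size difference is 5% or less of the mean size."""
    mean = (size_a + size_b) / 2.0
    return abs(size_a - size_b) <= window * mean


# Hypothetical fragment sizes in base pairs, for illustration only.
print(is_match(2000, 2080))  # difference 80, allowed 102
print(is_match(2000, 2120))  # difference 120, allowed 103
```

Under this rule the first pair would be declared a match and the second would not, even though both pairs might look aligned to the eye.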
F. The Fixed Bin System of Matching
The final step in the RFLP process involves statistical analysis of the significance of such a "match," using what is known as the "fixed bin" system. Statistical analysis is employed to determine the frequency at which a given DNA profile appears in a given population, in order to generate a figure representing the probability that a "match" would occur if the known and forensic samples were not from the same individual, i.e., the likelihood of a coincidental match. Using autoradiography, the FBI has developed databases for this purpose for four populations: Caucasian, Black, Hispanic, and Native American,[6] which contain information about the frequency with which fragment sizes appear in these general populations.
For statistical purposes, fragment sizes prevalent in the database populations are grouped in "bins," with upper and lower boundaries corresponding to a range of base pairs (e.g., a hypothetical bin might have boundaries from XXXX-XXXX base pairs). The fragment sizes within each bin appear at a particular frequency in the populations listed above. Under its old protocol using autoradiography, the FBI would assign a "matching" DNA fragment to the bin with the highest population frequency within 5% (i.e., +/-2.5%) of *407 its measured value, in order to take account of known measurement imprecision and reach a more conservative (i.e., favorable to the defendant) statistical estimate than it would otherwise reach. [G. Aff. ¶ 70-77]. See Jakobetz, 955 F.2d at 793-94; 1996 NRC Rep. at 5-18 to 5-20. This is known as the "uncertainty window," id. at O-11, or "bin search window."
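The bin search described above can be sketched as follows. The bin boundaries and frequencies below are invented for illustration and do not come from any FBI database; the function name is likewise hypothetical. The sketch assumes that a bin is "within" the window whenever any part of it overlaps the range from 2.5% below to 2.5% above the measured size.

```python
# Hypothetical bins: (lower bound, upper bound, population frequency).
BINS = [
    (1000, 1199, 0.04),
    (1200, 1399, 0.09),
    (1400, 1599, 0.06),
]


def conservative_bin_frequency(size_bp, bins=BINS, half_window=0.025):
    """Return the highest frequency among bins touched by the +/-2.5% window."""
    low = size_bp * (1 - half_window)
    high = size_bp * (1 + half_window)
    touched = [freq for (lo, hi, freq) in bins if lo <= high and hi >= low]
    if not touched:
        raise ValueError("measured size falls outside every bin")
    return max(touched)


# A 1410 bp fragment sits in the 0.06 bin, but its window reaches down into
# the 0.09 bin, so the larger (more defendant-favorable) figure is used.
print(conservative_bin_frequency(1410))
```

Choosing the larger frequency makes the profile look more common than it may be, which is why the text describes the approach as conservative.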
G. The Frequency Estimate
Finally, an overall statistical estimate is produced that purports to show the incidence at which a person with the same DNA features would be expected to appear in the database populations against which the fragments were compared. The 1996 NRC Report defines random-match probability as follows:
Suppose that a DNA sample from a crime scene and one from a suspect are compared, and the two profiles match at every locus tested. Either the suspect left the DNA or someone else did. We want to evaluate the probability of finding this profile in the "someone else" case. That person is assumed to be a random member of the population of possible suspects. So we calculate the frequency of the profile in the most relevant population or populations. The frequency can be called the random-match probability, and it can be regarded as an estimate of the answer to the question: What is the probability that a person other than the suspect randomly selected from the population, will have this profile? The smaller that probability, the greater the likelihood that the two DNA samples came from the same person.
1996 NRC Rep. at 5-3. See United States v. Chischilly, 30 F.3d 1144, 1157 (9th Cir.1994), cert. denied, ___ U.S. ___, 115 S.Ct. 946, 130 L.Ed.2d 890 (1995).
This DNA "profile" is arrived at using a mathematical calculation known as the "product rule." 1996 NRC Rep. at O-20. Once the "matches" have been sorted and binned, the resulting population frequency figures for each of the alleles analyzed are multiplied together, generating an overall "match probability" figure that is more reflective of the expected population frequency for the genetic sample tested than the frequency figure for any single allele alone.[7]
The validity of the product rule in this context depends upon the assumption that the alleles analyzed are statistically independent (i.e., the presence of a fragment of particular length does not result in it being more or less likely that another fragment of a particular length also will appear in that DNA sample). See 1996 NRC Rep. at O-20. See also Cauthron, 846 P.2d at 513. The statistical independence of alleles within a particular locus is known as Hardy-Weinberg equilibrium; statistical independence across loci is known as linkage equilibrium. See 1996 NRC Rep. at O-19.
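A minimal Python sketch of the product rule follows. It assumes the textbook Hardy-Weinberg expression for a two-band (heterozygous) pattern, 2pq, at each locus, and multiplies across loci on the assumption of linkage equilibrium. The allele frequencies are hypothetical, and the sketch is not a reproduction of the FBI's actual calculation, which footnote 7 and the affidavit describe.

```python
from math import prod


def two_band_frequency(p, q):
    """Hardy-Weinberg frequency of a two-band pattern with allele freqs p, q."""
    return 2 * p * q


def profile_frequency(loci):
    """Multiply the per-locus frequencies together (the product rule)."""
    return prod(two_band_frequency(p, q) for p, q in loci)


# Hypothetical binned allele frequencies at three loci, for illustration only.
loci = [(0.05, 0.10), (0.08, 0.05), (0.10, 0.02)]
freq = profile_frequency(loci)
print(f"profile frequency {freq:.2e}, about 1 in {round(1 / freq):,}")
```

Each additional independent locus multiplies the running figure by another small fraction, which is how a handful of loci can yield frequencies on the order of one in millions or billions. If the independence assumption fails, the multiplication is not justified, which is why the text ties the rule's validity to Hardy-Weinberg and linkage equilibrium.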
H. The Confidence Interval
The NRC has advocated a final conservative measure to deal with the statistical uncertainty inherent in an analysis of this sort: it is "safe to assume that the uncertainty of a profile frequency calculated by our procedures from adequate databases (at least several hundred persons) is less than a factor of about 10 in either direction." Id. at O-20.
We conclude that, when several loci are used, the probability of a coincidental match is very small and that properly calculated match probabilities are correct within a factor of about 10 either way.
Id. at O-27. This "ten × factor" is also referred to as a "confidence interval." Both the government and defense experts agree that this NRC proposal fairly eliminates concerns about uncertainties in the binning process. Earlier concerns about allelic independence led to the interim adoption of a modification to the product rule known as the "ceiling principle." This modification was merely a temporary conservative measure adopted to allow for the continued admission of DNA evidence derived through use of the product rule until the debate *408 about population substructure could be resolved. See id. at O-21. As the interim principle was roundly criticized as arbitrary and unscientific, see 1996 NRC Rep. at O-27, the 10 × factor, or confidence interval, is now generally regarded as a suitable alternative for dealing with the uncertainty that underlies the statistical approach utilized in DNA analysis.[8] See also id. at O-26-27.
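Applied to the match probabilities reported in this case, the factor of ten works out as in the following Python sketch. The function name is hypothetical, and the sketch simply divides and multiplies the "1 in N" figure by ten; it is an arithmetic illustration, not a statement of how the experts presented the range at trial.

```python
def tenfold_range(one_in):
    """Return the 'one in N' figure divided by and multiplied by ten."""
    return one_in / 10, one_in * 10


# Figures reported for the Caucasian database in this case.
for label, one_in in [("RFLP", 11_000_000_000), ("PCR", 810_000)]:
    low, high = tenfold_range(one_in)
    print(f"{label}: between 1 in {low:,.0f} and 1 in {high:,.0f}")
```

On these assumptions the RFLP figure of 1 in 11 billion would range from 1 in 1.1 billion to 1 in 110 billion, and the PCR figure of 1 in 810,000 from 1 in 81,000 to 1 in 8.1 million.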
I. October 1995 Changes to FBI Protocol
In October of 1995 the FBI introduced a new protocol for RFLP testing procedures which included: (1) elimination of autoradiography in favor of a detection system using chemiluminescence (that is, the radioactive molecules were replaced by luminescent ones to expose the film and thus mark the relative positions of the DNA fragments on the nylon membrane), (2) elimination of ethidium bromide from the gel, (3) use of a "longer" gel, and (4) reliance on a different sizing ladder for visual comparison of the lumigraphs. This new protocol was used in generating the DNA profiles at issue in this case. In every other respect, however, the procedures outlined above were followed.
1. Chemiluminescence
Research efforts on the suitability of chemiluminescence as a substitute for the more expensive, time-consuming and hazardous radioactive technique used in RFLP analysis began at the FBI in 1993. In October of 1995, after internal validation studies had been satisfactorily performed, the technique was formally approved for forensic use. In November, 1995, the FBI's DNA Unit discontinued use of autoradiography in forensic casework altogether. (Giusti Aff. ¶ 95-96).
The change to chemiluminescent detection involves only the hybridization stage of RFLP analysis. Probes bonded with alkaline phosphatase are used rather than radioactive probes. Once the probe bonds with the DNA fragments on the nylon membrane, the membrane is saturated with a chemiluminescent substrate that causes the probes to emit light. As with autoradiography, X-ray film is then placed over the saturated membrane and left to develop. Unlike autoradiography, which can take several weeks to complete and creates problems with radioactive waste disposal and potential health risks to lab technicians, chemiluminescence may be performed quickly and safely. The resulting product, a lumigraph, closely resembles an autorad, and is then visually compared, measured and statistically analyzed according to the same procedures detailed above. (Giusti Aff. ¶ 52-53). Chemiluminescence is a better detection method because it produces a crisper, clearer image than the autoradiographs.
2. Other Changes to the Protocol
In conjunction with its switch to chemiluminescence, the FBI also began using a "longer" gel in the electrophoresis stage, which resulted in better separation of the larger pieces of DNA; discontinued use of ethidium bromide (a carcinogen) in the gel; and introduced different sizing markers (or ladders) more suitable for sizing bands illuminated by lumigraphs than those that had been used for sizing bands on the autorads.
J. Polymerase Chain Reaction Analysis ("PCR")
PCR analysis is an alternative means of analyzing and comparing forensic DNA samples with samples of known origin. The process is important for forensics because it permits DNA profiling of samples containing much smaller quantities of DNA (such as saliva on a cigarette butt) that cannot be tested via the RFLP method, and because test results are available far more rapidly. *409 1996 NRC Rep. at 2-13. [Giusti test., Day 2]. Indeed, defendants in criminal cases have been known to be as interested in securing its use in this context as prosecutors. See People v. McSherry, 14 Cal.Rptr.2d 630 (Cal.App. 2 Dist.1992) rev. denied, Mar. 19, 1993 (unpublished opinion); Commonwealth v. Francis, 436 Pa.Super. 456, 648 A.2d 49 (1994); Commonwealth v. Curnin, 409 Mass. 218, 222, 565 N.E.2d 440 (1991).
With the PCR process, particular short segments of DNA are copied millions of times, in a process similar to that by which DNA duplicates itself naturally.[9] 1996 NRC Rep. at O-16, 2-11. "PCR is like a genetic photocopy machine," that "mimics DNA's self-replicating properties to make up to millions of copies of the original DNA sample in only a few hours." State v. Moeller, 548 N.W.2d 465, 480 (S.D.1996) (citation omitted). The amplification process is relatively simple, can be quickly carried out in the laboratory, often within 24 hours, is suitable for analyzing degraded DNA, and permits analysis of very small forensic samples that are not amenable to RFLP testing. 1996 NRC Rep. at 2-11. [Giusti Aff. ¶ 146(b)]. Moreover, most PCR tests permit exact identification of each allele at a particular locus, eliminating the measurement imprecision of RFLP, and hence the need for statistical analysis involving matching and binning of VNTR length measurements. 1996 NRC Rep. at 2-11.
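The "millions of copies" figure follows from simple doubling. The Python sketch below assumes an idealized reaction in which every copy is duplicated in every cycle; that assumption, the function names, and the starting quantities are illustrative and are not taken from the record, and real reactions fall short of perfect doubling.

```python
def copies_after(cycles, starting_copies=1):
    """Copies present after `cycles` rounds of idealized doubling."""
    return starting_copies * 2 ** cycles


def cycles_to_reach(target, starting_copies=1):
    """Smallest number of doubling cycles that yields at least `target` copies."""
    cycles = 0
    while copies_after(cycles, starting_copies) < target:
        cycles += 1
    return cycles


print(copies_after(20))            # copies from one template after 20 cycles
print(cycles_to_reach(1_000_000))  # cycles needed to pass one million copies
```

Under perfect doubling, a single template passes one million copies at the twentieth cycle, which is consistent with the text's description of amplification completed within hours.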
After the target segment is replicated, different analyses are performed depending on the locus being probed. There are three loci tested in this case: (1) the DQA1; (2) the Polymarker loci; and (3) the D1S80. For the DQA1 and the five Polymarker ("PM") loci, dot blots are created by the use of DNA probes designed to detect particular sequence variations in the amplified portions of these DNA segments, which permits visual comparison of the sequences of six different gene fragments, and determination of DNA type(s) that are present in a sample. [Giusti Aff. ¶ 141 and 150]. For the D1S80 locus, a process of gel electrophoresis similar in principle to that used in RFLP analysis is performed, and comparisons based on length variations are made between the samples. [Giusti Aff. ¶ 150].
For all three markers, if there is no visual "match" between the depictions of the known and unknown DNA fragments, or if results are inconclusive, an exclusion is declared and the analysis is complete. If, on the other hand, a visual correlation appears, the analysis requires a second step involving statistical determination of "the significance of the matching types obtained." [Giusti Aff. ¶ 141]. As with RFLP analysis, statistical analysis is performed on the PCR test results based upon population databases compiled for the particular markers tested (Caucasian, Black, Southeastern and Southwestern Hispanic).[10] The probability of a *410 random match for a particular sequence is estimated from the frequency that sequence appears in each database. 1996 NRC Rep. at O-23-26. [Giusti Aff. ¶ 155]
PCR analysis is less useful as a tool for positive identification if performed on a single locus because the variation between individuals at a single marker is not very great. For example, about 7% of the population at large has the same DQA type, 1996 NRC Rep. at O-16, which is useful information for quickly clearing a wrongly accused person, but does not itself give particularly strong evidence for positive identification. However, the likelihood that two individuals will share the same sequence and length polymorphisms across several unrelated loci drops dramatically, and very small population frequencies may be calculated. Id. at 6-11 and n. 35. [Giusti test., dir. and cross, Days 2-3]. Therefore forensic PCR analysis is typically conducted across multiple loci and the product rule, applied to the alleles measured in RFLP analysis, is also applied here: the frequency or "match probability" for each locus tested in a particular sample is multiplied together to generate an overall probability that this particular genetic profile will occur in the database populations. Id. at O-20. The same confidence interval discussed in connection with RFLP analysis is then applied to express this overall match probability as a range that could vary by a factor of ten in either direction.
One disadvantage of the PCR method is that the procedure is particularly susceptible to sample contamination, which could confound interpretation of the results and lead to a possibly erroneous conclusion. Id. O-16. Second, because there is less variation between individuals in the make-up of alleles within each of the small segments amenable to the PCR process, more loci must be tested to produce the same degree of discrimination among individuals as the RFLP method generates. Id. This in turn requires that the markers chosen for analysis be "independent," i.e., that they are not associated with disease-causing genes, id. at O-16, genes that influence mate selection (such as height), id. at O-19, or each other. [Giusti test., dir. and cross, Days 2-3]. However, "[t]hese are all problems that can be minimized by proper choice of markers and by care and good technique." Id. at O-16.
IV. Discussion
A. Standard For Admissibility of Scientific Evidence
Under the Federal Rules of Evidence, "the trial judge must ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable." Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 589, 113 S.Ct. 2786, 2795, 125 L.Ed.2d 469 (1993). Under Rule 702,[11] the standard of evidentiary reliability for scientific evidence is "based upon scientific validity." Id. at 590 and n. 9, 113 S.Ct. at 2795 and n. 9. Daubert requires that a court considering the admissibility of expert scientific testimony under Rule 702 must make "a preliminary assessment of whether the reasoning or methodology underlying the testimony is scientifically valid and of whether that reasoning or methodology properly can be applied to the facts in issue." Id. at 592-93, 113 S.Ct. at 2796.
Although the Court in Daubert deliberately avoided prescribing a definitive test, id. at 593, 113 S.Ct. at 2796-97, it did outline several non-exclusive factors for consideration in making this assessment. "A key question" is whether the theory, principle or scientific technique has been empirically tested. Id. A "pertinent consideration" is whether it "has been subjected to peer review and publication." Id. Publication, but one element of peer review, is "not a sine qua non of admissibility," and "does not necessarily correlate with reliability." Id. Additionally, a court "ordinarily should consider the known *411 or potential rate of error," id. at 594, 113 S.Ct. at 2797, and the "existence and maintenance of standards controlling the technique's operation." Id. Finally, general acceptance can have a bearing on admissibility, but a reliability assessment "does not require ... explicit identification of a relevant scientific community and an express determination of a particular degree of acceptance within that community." Id. (internal quotation and citation omitted). The goal of this "flexible" inquiry is to establish "the scientific validity -- and thus the evidentiary relevance and reliability -- of the principles that underlie a proposed submission." Id. at 594-95, 113 S.Ct. at 2797. Ultimately, the Court must also be "mindful" of the "danger of unfair prejudice, confusion of the issues, or [potential for] misleading the jury." Id. at 595, 113 S.Ct. at 2798 (quoting Rule 403).
B. Admissibility of RFLP Analysis in Federal Court
In determining whether RFLP analysis of forensic DNA evidence for identification purposes satisfies Daubert, this Court fortunately does not write on a clean gel. Although the First Circuit has not yet addressed the issue, most federal courts considering the question have held that RFLP analysis satisfies the standards elucidated above. See United States v. Davis, 40 F.3d 1069, 1074-76 (10th Cir.1994), cert. denied, ___ U.S. ___, 115 S.Ct. 1387, 131 L.Ed.2d 239 and cert. denied, ___ U.S. ___, 115 S.Ct. 1806, 131 L.Ed.2d 732 (1995); United States v. Chischilly, 30 F.3d 1144, 1152-58 (9th Cir.1994), cert. denied, ___ U.S. ___, 115 S.Ct. 946, 130 L.Ed.2d 890 (1995); United States v. Bonds, 12 F.3d 540, 550-68 (6th Cir.1993); United States v. Martinez, 3 F.3d 1191, 1195-98 (8th Cir.1993), cert. denied, 510 U.S. 1062, 114 S.Ct. 734, 126 L.Ed.2d 697 (1994); Jakobetz, 955 F.2d 786 (2d Cir.1992) (admitting evidence under pre-Daubert standard approximating Daubert's flexible analysis); United States v. Coronado-Cervantes, 912 F.Supp. 497 (D.N.M.1996). The Massachusetts Supreme Judicial Court has reached the same conclusion. See Commonwealth v. Lanigan, 419 Mass. 15, 641 N.E.2d 1342 (1994) (adopting Daubert under Massachusetts law and admitting RFLP test results in criminal case). Based on this solid phalanx of state and federal caselaw, the 1996 NRC report and the evidence at the Daubert hearing, this Court concludes that the RFLP methodology is reliable. Indeed, defense counsel did not launch a challenge to RFLP methodology generally, but focused on the specific changes to the FBI protocol.
1. Chemiluminescence and the FBI's New Protocol
Although the RFLP profiling procedure has been widely accepted and scientifically validated under Daubert, the use of chemiluminescence rather than autoradiography in the detection phase of forensic analysis is relatively new. As of this writing, no court, state or federal, has considered whether the use of chemiluminescent detection in RFLP analysis is scientifically valid, and therefore, admissible. Thus, this question is one of first impression.
Lowe argues that the substitution of chemiluminescence and the other protocol changes introduced to the FBI's RFLP analysis in October of 1995 are significant enough to require Daubert analysis for admissibility. He urges the Court to find the new protocol wanting under Daubert, particularly with respect to the use of chemiluminescence, because the FBI did not do adequate validation studies prior to implementation, has not published or otherwise submitted for peer review the one comprehensive study that it did conduct (the Headquarters Study), and cannot demonstrate that chemiluminescence is generally accepted. Lowe also urges as an additional ground for exclusion the FBI's alleged failure to document, compile and compute laboratory error rates for its RFLP procedure.
The Government disputes that the switch to chemiluminescence should occasion a full-blown Daubert analysis, as, in its view, it is simply a safer, cleaner and better technique for viewing band lengths, and has absolutely no effect on the validity of the overall methodology. It argues that the other changes are simply incidental to the move away from autoradiography, and bring the FBI's RFLP protocol more in line with the *412 rest of the forensics community. Pursuant to Fed.R.Evid. 104, 702 and 703, regardless of who wins this semantic debate, this Court must conduct a threshold evaluation of the new protocol to ensure reliability. While the protocol may not rise to the heights of a new scientific "methodology," the Daubert factors are helpful in determining its reliability.
a. Testing
Testing of chemiluminescence for clinical use with RFLP began in the early 1990s, and the results of these tests have been peer-reviewed and published. Gov. Exh. 9. The FBI has been a pioneer in expanding the use of chemiluminescence from a clinical to a forensic setting. The FBI conducted two validation studies on chemiluminescence, one of which has been published in the peer-reviewed literature (Quantico Study), and the other presented orally at the 1996 annual meeting of the American Academy of Forensic Sciences (Headquarters Study). Both showed that chemiluminescent detection results in population frequency figures substantially similar to those obtained with radioactive detection.
The FBI's Quantico study compared visual "match" results obtained with chemiluminescent RFLP analysis on non-forensic, or "pristine," DNA samples to those obtained with autoradiography, using the same match criterion, and concluded that chemiluminescence-based procedures "are sufficiently similar to existing procedures, so that there would be minimal disruption in laboratory protocol due to the introduction of the [new] procedures." Giusti, A. and Budowle, B., A Chemiluminescence-Based Detection System for Human DNA Quantitation and Restriction Fragment Length Polymorphism (RFLP) Analysis, 5 Applied and Theoretical Electrophoresis 89-98 (1995) (Gov.Exh. 2).[12] Both Mr. Giusti and Dr. Tracey testified credibly that the same band length information, with equal or better sensitivity, was exposed using lumigraphs as was obtained using autorads. (See Appendix A).
The Headquarters Study, not yet published, compared computer-generated band length measurements and statistical population frequencies obtained using chemiluminescence, as well as the three other protocol changes challenged here, on 113 casework samples upon which RFLP analysis under the old protocol had been performed. Although there was "excellent" agreement between autoradiography and chemiluminescence for band length measurement, the FBI did find some "overstatement" of fragment values of one to two percent using lumigraphs, as compared to autorads, particularly with larger fragments (those in excess of ten thousand base pairs). However, 97% of the population frequency results obtained with chemiluminescence were within 3% of the frequencies obtained with radiography, and those results that displayed a greater than 3% difference were for VNTRs containing more than 10,000 base-pairs, found at two particular loci (markers D1S7 and D4S139).
Although fragment length values obtained from lumigraphs and autorads could vary, the FBI argues that any variance was not significant with respect to the reliability of the FBI's RFLP methodology using the new protocol in determining band length. Giusti pointed out that there is a greater degree of measurement imprecision for larger fragments regardless of the detection method used, and that for these two loci RFLP analysis can display variations in excess of 5% on samples from the same individual. Thus, he does not attribute measurement imprecision for large fragments to the method of detection. The FBI does not generally rely on fragment measurements for VNTRs greater than 10,000 base-pairs for forensic analysis because measurement imprecision for DNA fragments of such size is expected under any methodology.
Second, the Headquarters Study also found that the corresponding population frequencies generated from the fixed-bin stage of the analysis, or DNA profiles, varied somewhat when chemiluminescent detection was substituted for autoradiography. Specifically, *413 about 50% of the population frequencies obtained under the new protocol were statistically the same, about 25% were less rare, and about 25% were more rare than those obtained under the old protocol (but "not by more than a factor of two," i.e., one in a million versus one in 500,000). [Giusti dir., Day 2]. The FBI concluded, however, that if the bin search window were doubled from +/-2.5% to +/-5%, then the population frequency profiles obtained under a chemiluminescence-based analysis were never more rare than those generated using radioactive detection. [Giusti test., Day 3, G. Aff. ¶ 100].
With the expanded bin search window, a DNA fragment is assigned the band measurement bin having the highest population frequency within a ten, rather than five, percent range of its band length measurement. Mr. Giusti testified that doubling the range within which to search for a higher population frequency always operates conservatively in the defendant's favor. Dr. Tracey confirmed that the expanded bin search window would "absolutely" account for any variation and assure that use of a chemiluminescent system would never result in a profile more prejudicial to a defendant than the old system.
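The logic of the bin search window can be sketched as follows. This is a simplified illustration under stated assumptions, not the FBI's fixed-bin software: the bin boundaries, bin frequencies, fragment length and function name are all hypothetical. The sketch shows only why widening the window cannot yield a rarer figure: every bin reachable within +/-2.5% is also reachable within +/-5%, so the highest frequency found can only stay the same or increase.

```python
def assigned_bin_frequency(measured_bp, bins, window):
    """Return the highest bin frequency reachable within the search window.

    measured_bp: measured fragment length in base pairs
    bins:        list of (start_bp, end_bp, frequency) tuples (hypothetical)
    window:      fractional half-width, e.g. 0.025 for +/-2.5%
    """
    lower = measured_bp * (1.0 - window)
    upper = measured_bp * (1.0 + window)
    # A bin is a candidate if any part of it overlaps the search window.
    candidates = [freq for start, end, freq in bins
                  if start <= upper and end >= lower]
    if not candidates:
        raise ValueError("no bin overlaps the search window")
    return max(candidates)


# Hypothetical bins and fragment length, invented for illustration only.
example_bins = [(1000, 1099, 0.02), (1100, 1199, 0.05), (1200, 1299, 0.09)]
narrow = assigned_bin_frequency(1150, example_bins, 0.025)  # +/-2.5% window
wide = assigned_bin_frequency(1150, example_bins, 0.05)     # +/-5% window
print(narrow, wide)
```

In this invented example the narrower window reaches only the middle bin, while the wider window also reaches the neighboring bins and so assigns the more common, and thus less incriminating, frequency.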
After an opportunity for discovery of the government's test results, Defendant's expert, Dr. Krane, did not dispute that lumigraphs display the same band length information with equal or greater clarity than autorads. He questioned the FBI's claimed similarity between the computer-generated band length measurements from the chemiluminescent and radioactive RFLP procedures, by testifying, contrary to Mr. Giusti, that in his comparison of the FBI's data[13] he observed that the band measurements from lumigraphs were consistently less than those obtained from autorads on the same samples, although "not by very much." He was unable to conclude, however, that the reliability of the FBI's RFLP analysis was significantly compromised by this observed effect. He further conceded that the FBI's expanded bin search window would "probably" take account of any systemic bias he had perceived "quite easily," although he expressed a desire to see further statistical validation of the FBI's data.
Based on the testimony of Giusti, Tracey and Krane, the Court concludes that the other protocol changes do not affect the admissibility of the RFLP test results so long as the expanded bin search window is used. With respect to the elimination of ethidium bromide, the FBI's new protocol brings it in line with the procedures used by most other forensic scientific laboratories. See Exh. 19A. Although DNA fragments might not move as quickly through a gel containing ethidium bromide as they would through a gel without it, the elimination has no significant effect on the measurement precision of the system because all of the DNA run through the gel, including the sizing ladders used for measurement, is affected equally. With regard to the use of a longer gel, the FBI's validation study confirms that this change would have no significant impact on the analysis of alleles with less than 10,000 base pairs. According to Tracey and Giusti, these changes would not affect any visual matches although they could affect the computer calculation by two to three percent. According to Tracey, the expanded binning window to determine frequency would accommodate any such variation.
The new sizing ladders were designed exactly the same way as the old, but are more appropriate for use with lumigraphs; "fragment length is known as absolutely from one ladder to another." Furthermore, the new sizing ladders were part of the FBI's published validation study involving chemiluminescence, Gov. Exh. 2, as well as the Headquarters Study that was presented at the American Academy of Forensic Sciences, both of which demonstrated that the changes had no significant impact on the reliability of the RFLP process. (See Ex. 19A). All of the protocol changes were encompassed by the Headquarters Study, which served to validate the entirety of the new protocol.
*414 b. Peer Review
The use of luminescent molecules has been published in peer-reviewed literature. The FBI's Quantico Study concluded that chemiluminescence could be substituted for autoradiography in RFLP analysis with improvements in resolution, quality control, versatility and robustness, as well as considerable reductions in time, cost and hazard, and without any loss of sensitivity. It was published in Applied and Theoretical Electrophoresis, a peer-reviewed periodical, in December of 1995, although it was submitted for publication, and thus to the peer-review process, almost one year earlier. (Gov.Exh. 2).
The Headquarters Study, focusing on variation in band length measurements and statistical profiles, and the data concluding that an expanded bin search window is appropriate as a conservative measure, has not been published in a peer-reviewed publication. However, the data, results and conclusions of the Headquarters Study were presented at a February 1996 meeting of the American Academy of Forensic Sciences, which is considered part of the peer review process.
The Army independently conducted and published in a peer-reviewed periodical a validation study for chemiluminescence similar to the FBI's Headquarters Study, and also concluded that there is no significant difference between the use of lumigraphs and autorads in RFLP analysis. Johnson, E. and Katowski, T.M., Chemiluminescent Detection of RFLP Patterns in Forensic DNA Analysis, 41 Journal of Forensic Sciences 569 (1996) (Exh. 6) (concluding that the chemiluminescent detection system is well suited to use in forensic casework).[14]
c. General Acceptance By Forensic Community
Chemiluminescence is widely accepted in the forensic scientific community. Validation studies have been conducted by both major British forensic law enforcement laboratories (including Scotland Yard) and by two leading commercial manufacturers of RFLP forensic testing kits in America, Cellmark Diagnostics and Lifecodes Corporation. All of these laboratories switched from autoradiography to chemiluminescence prior to the FBI's decision to do so. Chemiluminescence is also used by the U.S. Army and in Spain. The 1996 NRC Report comments that "some laboratories are beginning to use luminescent molecules as labels on their probes." 1996 NRC Rep. at 2-11. Finally, a national consortium of forensic labs (TWGDAM) reports that an August 1996 survey it conducted revealed that 38% of forensic labs responding currently use chemiluminescent detection in RFLP DNA analysis, and a total of 71% will be using it by February of 1997. (Gov.Exh. 7).
d. Error Rate and Laboratory Standards
Daubert suggests that "known or potential error rate" in the performance of a particular scientific technique should "ordinarily" be considered in the process of determining admissibility. 509 U.S. at 594, 113 S.Ct. at 2797.
The FBI does not compute a systemic laboratory error rate. Giusti contends this is because the FBI has controls in place that are designed to catch and correct errors when they occur, thus preventing systemic error from becoming established at any particular "rate." Lowe insists[15] that Daubert does not give the FBI laboratory the sanguinity of claiming that it is error free, and that even if the FBI catches all of its errors and prevents them from recurring, he has the right to know how often such errors occur. Lowe challenges both the reliability of the evidence per se, and the fairness (i.e., prejudice) of presenting it to the jury without also including an error rate calculation to enable the jury to fairly weigh its probativeness.
*415 With respect to reliability, Lowe has cited no federal cases, and the Court can locate none, that have excluded DNA evidence on account of a theoretical rate of error alone. There is no evidence that the switch to chemiluminescence somehow involves any greater risk of laboratory error. The FBI's validation studies show to the contrary.
To the extent Lowe argues that match probability should only be admitted when laboratory error rate can also be disclosed, the Court heard persuasive testimony from Dr. Tracey that formulaic error "rates" can in fact be misleadingly high, and also present problems of confusing the jury. According to the NRC:
Some commentators have argued that the probability of a laboratory error leading to a reported match for samples from different individuals should be estimated and combined with the probability of randomly drawing a matching profile from the population. ... We believe this approach to be ill advised. It is difficult to arrive at a meaningful and accurate estimate of the risk of such laboratory errors.
1996 NRC Rep. at O-17 (Emphasis added). The NRC instead emphasizes the importance of conforming to established protocol for quality control, id. at 3-11, and recommends that "high quality standards" be followed, "such as those defined by TWGDAM and the DNA Advisory Board." Id. at 3-12.
The DNA Identification Act of 1994, 42 U.S.C. § 14131(a) ("the Act") established the DNA Advisory Board for the purpose of overseeing the setting of national DNA criteria for quality assurance and proficiency tests to be applied to various types of DNA analyses used by forensic laboratories. See 1996 NRC Rep. at 3-11. TWGDAM stands for the Technical Working Group on DNA Analysis Methods, to which the FBI laboratory belongs. The FBI is required to follow the TWGDAM quality assurance guidelines until the FBI Director has acted upon recommendation of the DNA Advisory Board. 42 U.S.C. § 14131(a)(4).
The Act addresses the issue of "blind external proficiency tests" defined as "a test that is presented to a forensic laboratory through a second agency and appears to the analysts to involve routine evidence". By September 13, 1995, the Director of the National Institute of Justice ("NIJ") was required to certify to the House and Senate Committees on the Judiciary that by September 13, 1996 a "blind external proficiency testing program for DNA analysis" would be made available to public and private laboratories, that it was already readily available, or that it was not feasible to have blind external testing for DNA forensic analyses. There is no evidence before this Court as to whether the report was filed. According to the 1996 NRC report, the desirability of requiring blind external testing is still being studied by the NIJ.
Lowe has not argued, and the Court has no independent reason to believe, that the TWGDAM guidelines were not adhered to in this case,[16] or that any national standards have been violated. FBI technicians routinely undergo (twice per year), and have routinely passed, internally and externally administered proficiency testing, both open and blind, with respect to RFLP testing. Giusti himself has undergone two open and one blind proficiency test involving the new FBI protocol using chemiluminescence, and has received information from the FBI Quality Control Unit that he correctly matched all samples in the test.[17] No federal or state cases of which this Court is aware suggest that adherence to protocol has been a problem at the FBI. Compare People v. Castro, 144 Misc.2d 956, 974-77, 545 N.Y.S.2d 985, 996-98 (Sup.Ct.1989) (private lab failed to comply with its own guidelines); State v. Schwartz, 447 N.W.2d 422, 428 (Minn.1989) (private lab did not comply with TWGDAM guidelines).
*416 According to the NRC, "[a] wrongly accused person's best insurance against the possibility of being falsely incriminated is the opportunity to have the testing repeated. Such an opportunity should be provided whenever possible." 1996 NRC Rep. at 3-11. Lowe was provided with split samples and such an opportunity to test them himself in this case.
e. Conclusion
Based on the foregoing, the Court concludes that: (1) the chemiluminescent detection procedure can be and has been adequately tested, and has been sufficiently validated and accepted by the relevant scientific community for forensic use in RFLP analysis; (2) the other protocol changes have had no significant impact on the reliability of the FBI's RFLP procedures; (3) the FBI's expansion of the bin search window for lumigraph measurements yields a more conservative DNA profile than the old approach and accommodates for any measurement variations caused by the new detection method; and (4) in the absence of any evidence that the FBI laboratory failed to adhere to required Quality Control protocol in this or other cases, lack of an ascertainable laboratory error rate for the use of chemiluminescence in the RFLP process does not weigh against admission of the evidence, but could impact its weight. In light of this, and the fact that the RFLP process itself has already been admitted under Daubert in numerous federal cases, the Court concludes that the RFLP test results under the new protocol are admissible.
C. PCR Analysis
Lowe does not challenge the reliability of PCR methodology in general. The PCR process for DNA amplification and analysis generally, and its advantages for detection of genetic variation in the forensic context, have been the subject of numerous scientific papers in the peer-reviewed scientific literature. [G. Aff. ¶ 145-148]. The FBI began investigating the forensic capacity of the PCR methodology for various loci in late 1988, and testing the technique for forensic use in 1989. [Giusti test., Day 2]. The FBI began using the PCR process for forensic casework with the DQA1 marker in April of 1992, added the PM loci in August of 1994, and the D1S80 locus in October of 1995. The 1996 NRC Report points out "PCR-based typing is widely and increasingly used in forensic DNA laboratories in this country and abroad." (P. 2-11).
Although no federal court has yet examined PCR methodology under Daubert, courts in at least sixteen states, as well as the U.S. Air Force Court of Criminal Appeals, have admitted the results of PCR testing under either a Daubert- or Frye-type analysis. See Harmon v. State, 908 P.2d 434, 438-42 (Alaska Ct.App.1995) (allowing unspecified PCR test under Frye); Seritt v. State, 647 So.2d 1 (Ala.Ct.Cr.App.1994) (allowing DQA test under Frye); People v. Morganti, 43 Cal.App.4th 643, 50 Cal.Rptr.2d 837, 849-55 (1996) (allowing DQA test under Frye); Redding v. State, 219 Ga.App. 182, 464 S.E.2d 824 (1995) (allowing DQA and D1S80 under Daubert variant); State v. Hill, 257 Kan. 774, 895 P.2d 1238, 1246-47 (1995) (allowing DQA test under Daubert variant); State v. Spencer, 663 So.2d 271, 275-75 (La. Ct.App.1995) (allowing unspecified PCR test under Frye); People v. Lee, 212 Mich.App. 228, 537 N.W.2d 233, 251 (1995), appeal denied, ___ Mich. ___, 554 N.W.2d 12 (1996) (allowing DQA test under Frye); State v. Grayson, No. K2-94-1298, 1994 WL 670312, at *1-5 (Minn.Dist.Ct. Nov. 8, 1994) (unpublished) (allowing DQA test under Frye and Daubert); State v. Moore, 268 Mont. 20, 885 P.2d 457, 467-68, 474-75 (1994) (allowing DQA test under Daubert); State v. Williams, 252 N.J.Super. 369, 599 A.2d 960, 966-67 (Law Div.1991) (allowing DQA test under Frye); People v. Morales, ___ A.D.2d ___, 643 N.Y.S.2d 217, 218-19 (1996) (allowing DQA and Polymarker tests under Frye); State v. Penton, No. 9-91-25, 1993 WL 102507, at *4-5 (Ohio Ct.App. Apr. 7, 1993) (unpublished) (allowing DQA test under Daubert variant); State v. Lyons, 324 Or. 256, 924 P.2d 802 (1996) (allowing DQA test under Daubert variant); State v. Moeller, 548 N.W.2d 465, 479-83 (S.D.1996) (allowing DQA test under Daubert); State v. Begley, No. O1C01-9411-CR-00381, 1996 WL 12152, at *5-7 (Tenn.Crim.App. Jan. 11, 1996) (unpublished) (allowing unspecified PCR test *417 under statute modelled on Rule 702); Campbell v. State, 910 S.W.2d 475, 478 (Tex.Crim. App.1995) (en banc) (allowing DQA test under Daubert), cert. denied, ___ U.S. ___, 116 S.Ct. 1430, 134 L.Ed.2d 552 (1996); Spencer v. Commonwealth, 240 Va. 78, 393 S.E.2d 609, 620-21 (allowing DQA test under Daubert variant), cert. denied, 498 U.S. 908, 111 S.Ct. 281, 112 L.Ed.2d 235 (1990); State v. Gentry, 125 Wash.2d 570, 888 P.2d 1105, 1117-18 (en banc) (allowing DQA test under Frye), cert. denied, ___ U.S. ___, 116 S.Ct. 131, 133 L.Ed.2d 79 (1995); United States v. Thomas, 43 M.J. 626, 633-634 (1995) (allowing Amp-FLP PCR marker test under Daubert). But see State v. Carter, 246 Neb. 953, 524 N.W.2d 763 (1994) (excluding DQA test results on ground that statistical evidence and analysis flawed). In all of the above cases admitting PCR-based testing, the courts held that the methodology had been sufficiently scientifically validated as reliable and/or was generally accepted among forensic scientists for the results to be admitted at trial.
Lowe challenges the FBI's PCR analysis for the following reasons: (1) the Polymarker and D1S80 tests have not been scientifically validated or generally accepted;[18] (2) "independence" between the three loci has not been adequately demonstrated to permit use of the product rule to combine their results; (3) the Government has not demonstrated that the FBI has adequate controls in place to account for the susceptibility of the PCR methodology to contamination where the FBI refuses to seek outside accreditation; and (4) the FBI has failed to establish laboratory error rates for any of the PCR tests it performs. The Court addresses each argument in turn.
1. Scientific Validity of Polymarker and D1S80 Tests
While conceding that the DQA1 test has been widely accepted by the forensic community, Lowe maintains that the Polymarker and D1S80 tests are simply too new for their reliability to have been determined through a peer-review process. At least one state court has admitted results of the Polymarker test, see Morales, 643 N.Y.S.2d at 217, and at least one has admitted the results of D1S80 testing.[19] See Redding, 464 S.E.2d at 824 (affirming acceptance of D1S80 tests administered by one lab and rejection, for reasons that are unclear, of PCR tests administered by another).
The challenged Polymarker and D1S80 tests conducted in this case have been validated for forensic DNA analysis. (G. Aff. ¶ 152; Gov. exh. 13-16). See 1996 NRC Rep. at 2-13; G. Aff. ¶ 165 (citing published studies validating all three of the PCR-based DNA typing methods employed by the FBI). The FBI's validation and population studies covered all three of the PCR marker tests employed in this case and showed their reliability under forensic conditions (robustness). All were submitted to and published in peer-reviewed scientific literature, and no evidence was presented to this Court questioning either the reliability of the studies, or of the test procedures themselves. The 1996 NRC Report states that the technology utilized in various types of PCR tests, including those challenged here, "is thoroughly sound and ... the results are highly reproducible when appropriate quality-control methods are followed." 1996 NRC Rep. at O-16. It refers to PCR-based DNA testing as "widely and increasingly used in forensic DNA laboratories in this country and abroad," id. at 2-11, and specifically cites the Polymarker loci testing system as "widely used." Id. at 2-13. *418 It notes that D1S80 is "increasingly used," and that it has been validated both for robustness to environmental insults and for independence from other alleles. Id. at 4-31. At least one private forensic laboratory has also extensively used the D1S80 locus (Docket No. 123).
The defense expert Dr. Krane describes the AMP-FLP[20] D1S80 locus as "a novel application which shows great promise" but expresses concerns that the publication of the validation studies in peer-reviewed literature has coincided with the implementation of the methodology with insufficient time for scientific scrutiny by the forensic community. The government retorts that twelve peer-reviewed publications on D1S80 authored within and from outside the FBI demonstrate its reliability and points out the absence of any articles to the contrary.
Based on the favorable description by the National Research Council's Commission on Forensic DNA Science, the peer-reviewed studies, the expert testimony at the Daubert hearing, and the lack of any scientific evidence disputing the reliability of the PCR methodology at any of the three loci, the Court finds that the PCR methodology passes Daubert muster with respect to the DNA profiling at the Polymarker and D1S80 loci. The relative lack of experience with the D1S80 loci testing system (as contrasted with the other loci) may affect the weight of the evidence, but the government has demonstrated that the methodology is reliable.
2. Independence of the Loci and the Product Rule
Lowe's second challenge to the PCR test results involves the use of the product rule to combine the results of all three PCR tests to a single population frequency figure. The validity of using the product rule in this context depends upon the independence, or lack of association between the alleles or loci examined.
The FBI's validation studies demonstrate that there is "little evidence for association of the alleles between the loci" covered by the three PCR tests, including the D1S80. Budowle, B. et al., Validation and Population Studies of the Loci LDLR, GYPA, HBGG, D7S8, AND Gc (PM loci), and HLA-DQα Using a Multiplex Amplification and Typing Procedure 40 Journal of Forensic Sciences 45 (Jan. 1995). The peer-reviewed FBI study pointed out that "the data support that for the seven PCR-based loci the populations meet expectations of independence ..." Id. at 49-50. (Gov.Exh. 13). Finally, it concluded that, just like for the multiple allele profiles obtained in RFLP analysis, "valid estimates of a multiple locus profile frequency can be derived for identity testing purposes using the product rule...." Id. at 53.
Defendant relies on Dr. Tracey's prior testimony in another court proceeding that the test used by the FBI to determine independence between the DQ-Alpha, Polymarker and D1S80 loci was incapable of detecting a deviation from independence of up to thirty percent. Lowe's expert, Dr. Krane, testified that there was some evidence for association, "little as it is," and that this was enough to undermine the reliability of using the product rule to combine the results of these tests, because the validity of the product rule depends upon the assumption that there is no association. He went on to testify, though, that the confidence interval, or 10 × factor, would take account of the evidence of association he could discern in the FBI's data, at least for estimates at or below one in 500 million. Dr. Tracey agreed that the 10 × factor proposed by the NRC was conservative enough to account for any undetected association between the PCR loci, and that this conclusion is consistent with the view of the NRC. See 1996 NRC Rep. at O-20. While the theoretical possibility of an association may be as high as thirty percent, there is no actual evidence of any dependence sufficient to undermine the reliability of the product rule. Thus, while the possible association *419 between the loci may be fodder for cross-examination, there is sufficient reliability for admissibility.
3. FBI's Laboratory Standards
Lowe's final challenge centers on the alleged failure of the FBI's DNA Unit to estimate error rates for its PCR-based procedures, and to maintain adequate and ascertainable laboratory standards to control the particular problem of contamination associated with PCR-based DNA analysis.
a. Contamination
Lowe contends, and the Government does not dispute, that the amplification process renders PCR-based DNA analysis particularly susceptible to contamination, which could affect the reliability of results obtained. He raises concerns about both "cross contamination," involving the contamination of one sample by another during handling before, during or after DNA extraction, and "carryover contamination," which can occur when matter from previous PCR tests remains on or around the laboratory area utilized in the process, contaminating any future samples tested. He contends that the FBI's failure to seek and obtain outside accreditation means there is "essentially no assurance that the FBI lab is free from the very real dangers of contamination," (Doc. # 117 at pp. 11-12) and is therefore fatal to the reliability of the FBI's PCR procedures.
While there is a documented risk that amplification of contaminated DNA could lead to false positives and negatives, 1996 NRC Rep. at 2-12, the NRC has suggested laboratory methods that mitigate this concern. Id. at 3-7 to 3-9. Because "contamination control is a high priority" at the FBI's DNA Unit, the FBI has implemented a series of controls designed to prevent contamination of PCR samples once they reach the FBI laboratory,[21] which conforms to the guidelines adopted by TWGDAM, and is consistent with the precautions recommended by the NRC. [Giusti Aff. ¶¶ 158-164]. Again, Lowe introduced no evidence that the FBI was not in compliance with its statutory obligation to follow these guidelines, either generally, or with respect to the particular PCR tests conducted in this case.
b. Error Rate
The crux of Lowe's challenge here seems to be that the FBI's failure to undergo blind proficiency testing for its PCR-based tests prior to beginning to utilize them in casework impugns their reliability under Daubert. Citing the earlier 1992 NRC Report, he maintains that blind testing is "essential" to implementation of a new methodology for DNA analysis, and that because neither the D1S80 nor the Polymarker test underwent such trials, their results should not be admitted. (Doc. # 117, pp. 6-7). He also raises again in this context the FBI's lack of outside accreditation.
The FBI's DNA Unit requires its examiners to undergo regular proficiency testing two times per year. As of the evidentiary hearing in this case, FBI lab technicians had undergone a total of 48 "open" proficiency tests using PCR procedures since PCR was first introduced in 1994. All 48 were conducted for the DQA1 marker test, 24 of the exams covered Polymarker, and 12 covered D1S80. Twenty-six of these proficiency tests were administered internally, by the FBI's Quality Control Unit, a separate division from the DNA Unit, and 22 have been administered *420 externally, by Cellmark Diagnostics.[22]
Although "blind" proficiency tests are conducted for the FBI's RFLP process, no blind tests are conducted for its PCR process. The 1992 NRC report stressed:
Most important, there is no substitute for rigorous proficiency testing via blind trials. Such proficiency testing constitutes scientific confirmation that a laboratory's implementation of a method is valid not only in theory, but also in practice. No laboratory should let its results with a new DNA typing method be used in court, unless it has undergone such proficiency testing via blind trials.
1992 NRC Report at 5.
The 1996 version of the NRC report takes a softer approach:
In open proficiency tests, the analyst knows that a test is being conducted. In blind proficiency tests, the analyst does not know that a test is being conducted. A blind test is therefore more likely to detect such errors as might occur in routine operations. However, the logistics of constructing fully blind proficiency tests [to ensure the laboratory will not suspect that it is being tested] are formidable. The "evidence" samples have to be submitted through an investigative agency so as to mimic a real case, and unless that is done very convincingly, a laboratory might well suspect it is being tested. Whichever kind of test is used, the results are reported and, if errors are made, needed corrective action is taken.
1996 NRC Rep. at O-17. TWGDAM specifies only that DNA examiners undergo two open proficiency tests per year, and that the results, including corrective action taken, be documented. Id. at 3-4. TWGDAM recommends one full blind proficiency test per laboratory per year if such a program can be implemented.[23] Id. The FBI lab claims it is in compliance with TWGDAM requirements, but acknowledges it does not follow the recommendation for blind proficiency testing.
Although the FBI's explanation of why blind testing is not feasible for PCR testing but is feasible for RFLP testing is not persuasive, the lack of such testing is not determinative of admissibility. See United States v. Bonds, 12 F.3d at 560 (allowing FBI DNA profiling evidence although the "deficiencies in calculating the rate of error and the failure to conduct extensive blind proficiency tests are troubling."). The potential for and significance of contamination, the adequacy of proficiency testing, accreditation, and the significance of whether a laboratory estimates error rates all concern the issue of quality control. Absent evidence demonstrating that the particular quality control procedures followed by the FBI laboratory violated a statute, regulation or a generally accepted industry requirement, these issues impact the weight of the evidence rather than its admissibility. See Lee, 537 N.W.2d at 254 (while accreditation is desirable, its absence "should not bar the admissibility of PCR-produced DNA evidence"); Russell, 882 P.2d at 766 (same); Moore, 885 P.2d at 475 (contamination can occur with virtually any physical evidence, and thus goes to weight, not admissibility); Lyons, 924 P.2d at 813 ("The potential for contamination may present an open field for cross-examination or may be addressed through the testimony of defense experts ... [but] does not mean that the PCR method itself is inappropriate for forensic use."); Commonwealth v. Teixeira, 40 Mass.App.Ct. 236, 240, 662 N.E.2d 726 (1996), review denied, 422 Mass. 1107, 664 N.E.2d 1197 (weaknesses in laboratory's proficiency testing go to weight of DNA evidence, not its admissibility); Commonwealth v. Clark, Mass.Sup.Ct. slip op. (May 1996) (Brady, Patrick J.) (unpublished) ("The concept of error rate does not yet have such a firm, fixed meaning in the forensic testing community that the absence of a procedure to establish it is a fatal flaw to the admission *421 of the evidence."). See also 1996 NRC Rep. at 6-12 ("Most courts have decided that [criticisms about contamination potential of forensic PCR analysis] are pertinent to assessing the weight of the evidence but do not warrant the wholesale exclusion of PCR-based tests.") (footnotes omitted).
In sum, the Court is satisfied that the Government has established the scientific validity, and thus the evidentiary reliability, of the tests and techniques employed in the PCR-based DNA analysis conducted in this case, as required by Daubert.
ORDER
Based on the foregoing reasons, it is hereby ORDERED:
1. Lowe's motion to exclude the results of RFLP DNA analysis conducted by the FBI in this case (Docket No. 104) is DENIED.
2. Lowe's motion to exclude the results of PCR DNA testing conducted by the FBI in this case (Docket No. 117) is DENIED.
*422 APPENDIX A
*423 APPENDIX B
*424 APPENDIX C
NOTES
[1] The three PCR-based tests employed in this case were the Polymarker, D1S80, and the DQ-alpha test, referred to here as DQA1.
[2] As it turns out, the government never offered the defendant's admission as evidence, and the defendant did not testify.
[3] The samples were split so that the defense would have an opportunity to subject them independently to forensic DNA testing. Lowe has declined to inform the Court of the results, if any, of his independent analysis.
[4] This second report is designed to address certain questions left unanswered and controversies generated by the first NRC report. See National Research Council, DNA Technology in Forensic Science (1992).
[5] In this case, the FBI used six probes. Giusti Aff. ¶ 89.
[6] Each of the first three contain DNA profiles of 500 to 700 samples. The Native American database contains approximately 200 samples. Giusti Aff. ¶ 73.
[7] For example, if the probabilities of finding alleles A, B and C in the population are 1 in 10, 1 in 5, and 1 in 20, respectively, the product rule principle dictates that only 1 in 1000 persons will match all 3 (1/10 × 1/5 × 1/20 = 1/1000). See State v. Cauthron, 120 Wash.2d 879, 846 P.2d 502, 513 (1993) (en banc).
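The arithmetic in this example can be checked with a short sketch. The Python below is illustrative only: the function name `product_rule` is an assumption introduced here, and the calculation presumes that the alleles are statistically independent, as the product rule requires.

```python
from fractions import Fraction
from functools import reduce
from operator import mul


def product_rule(frequencies):
    """Multiply per-allele frequencies, assuming they are independent."""
    return reduce(mul, frequencies, Fraction(1))


# Frequencies of alleles A, B and C taken from the example above.
alleles = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 20)]
combined = product_rule(alleles)
print(combined)  # 1/1000
```

Exact fractions are used rather than floating-point numbers so that the result prints in the same "1 in N" form as the example.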
[8] Although the confidence interval was a battle-ground in the Daubert hearing, defendant did not cross-examine the expert on the 10 × factor, the ceiling principle or the artificial cap the FBI testified it used in its statistical analysis at trial. At side bar, the Court specifically flagged the failure of the defendant to object when Mr. Giusti did not testify as to the 10 × factor. Nonetheless, the testimony of the government expert went in without objection on this point. As the primary theory of defense was consent, not identity, this strategic decision made sense. Indeed, a challenge to the identity of the alleged assailant was not a feasible theory of defense as defendant was arrested with K.'s jewelry on. The Court refused to exclude the evidence pursuant to Fed.R.Evid. 403 as it was probative not only on the issue of identity but also on the location of the alleged rape. Its probative value was not substantially outweighed by the danger of unfair prejudice, confusion of the issues, or misleading the jury.
[9] Amplification occurs through the following three-step process:
First, each double-stranded segment is separated into two strands by heating. Second, these single-stranded segments are hybridized with primers, short DNA segments (20-30 nucleotides [i.e., base pairs] in length) that complement and define the target sequence to be amplified. Third, in the presence of the enzyme DNA polymerase, and the four nucleotide building blocks (A, C, G, and T), each primer serves as the starting point for the replication of the target sequence. A copy of the complement of each of the separated strands is made, so that there are two double-stranded DNA segments. This three-step cycle is repeated, usually 20-35 times. The two strands produce four copies; the four, eight copies; and so on until the number of copies of the original DNA is enormous.
1996 NRC Rep. at 2-11; Giusti Aff. ¶ 150. See also Spencer v. Commonwealth, 240 Va. 78, 393 S.E.2d 609, 620, cert. denied, 498 U.S. 908, 111 S.Ct. 281, 112 L.Ed.2d 235 (1990) (detailing PCR process).
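The doubling described in the quoted passage implies that, under ideal conditions, a single double-stranded segment yields 2^n segments after n cycles. The Python sketch below illustrates only that idealized arithmetic; the function name `copies_after` and the assumption of perfect doubling in every cycle are introduced here and are not drawn from the record.

```python
def copies_after(cycles, starting_copies=1):
    """Idealized PCR yield: every cycle doubles the number of copies."""
    return starting_copies * 2 ** cycles


# The first few cycles, then the low and high ends of the quoted 20-35 range.
for n in (1, 2, 3, 20, 35):
    print(n, copies_after(n))
```

Even at the low end of the quoted range, twenty cycles, the idealized count exceeds one million copies, which is why a trace contaminant present before amplification can matter.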
[10] The number of donors in the databases assembled for PCR analysis is smaller than in those created for RFLP. The RFLP databases are based upon data from between 500 and 700 persons, with the exception of the Native American database, which rests on data from some 200 persons. [Giusti Aff. ¶ 73]. In contrast, the databases for DQA1 and PM consist of data from 298 Caucasians, 338 Blacks, 164 Southwestern Hispanics, and 265 Southeastern Hispanics; databases for D1S80 consist of 718 Caucasians, 606 Blacks, 247 Southeastern Hispanics and 162 Southwestern Hispanics. [Id. ¶ 155]. Lowe suggested at one point during the hearing that the PCR databases were too small for frequency estimates premised upon them to be reliable. However, the government's experts testified, and Lowe's expert did not dispute, that a database of several hundred individuals is sufficient. See 1996 NRC Rep. at O-20. The Polymarker and D1S80 databases for the Caucasian population were culled from 298 and 718 persons, respectively.
[11] Rule 702 provides:
If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise.
[12] This peer-reviewed publication concerned just chemiluminescence, not the entire new protocol, which includes the other three changes.
[13] Dr. Krane examined the results of both open and blind proficiency tests performed by Mr. Giusti, as well as the data from the Headquarters study on the 113 casework samples.
[14] The Army study did not conclude that an expanded bin search window was necessary because it did not demonstrate any statistically significant differences in the measurements obtained with chemiluminescence. Mr. Giusti testified during the hearing that the expanded window was a conservative measure that might not prove necessary in the long run.
[15] After discovery, Lowe does not argue, and elicited no testimony tending to suggest, that an error was committed in the testing procedures conducted in this case.
[16] Defendant did introduce in camera documents regarding alleged deficiencies in the proficiency testing at the time the population databases were compiled. I find that these hearsay allegations were insufficient to undercut the reliability of the population studies.
[17] The known DNA sample (K562), run as a control along with every forensic analysis performed in the DNA Unit, was designed to serve the purpose of catching errors if and when they occur during an actual forensic test.
[18] He does not challenge the DQA1 test on this ground.
[19] The results of both the Polymarker and D1S80 tests administered in connection with several criminal cases consolidated for evidentiary hearings were recently rejected by a comprehensive unpublished opinion of Massachusetts Superior Court Judge Moriarty, who relied on the testimony of Dr. Budowle, the FBI witness, on September 18, 1995, that the Polymarker technique, although reliable, had only been in use for about a year, and that the D1S80 test had just been started for casework in the FBI. The ruling is now being considered by the Supreme Judicial Court on interlocutory appeal by the Commonwealth. See Commonwealth v. Sok, Mass.Sup.Ct. No. 92-10979 at 194. At the hearings before me which concluded on September 17, 1996, the FBI had the benefit of an additional year of use of these PCR testing systems.
[20] "AMP-FLP" is the designation for the PCR system test on fragment length polymorphisms because the VNTR at the particular locus is amenable to amplification and analysis. The FBI has tested and rejected the AMP-FLP test on another locus, "APOB." See generally United States v. Thomas, 43 M.J. at 633 (allowing the results of the PCR-AMP-FLP test in the APOB region conducted by a German laboratory).
[21] These include:
(1) Known ("K") samples are always processed separately from unknown ("Q") samples;
(2) DNA is extracted in a separate biological containment room that is isolated from the area where samples are stored;
(3) Amplification of extracted samples is performed in yet another biologically contained lab room separated from the extraction facility;
(4) Lab technicians are prohibited from working in both rooms on the same day;
(5) Every sample is processed together with a known control sample, labelled K562, the results of which should be the same in every test run;
(6) Every sample is also processed together with a "blank" sample, containing no DNA, which will pick up and display any DNA contamination present.
If any indication of contamination appears in either the blank or K562 samples run in a particular test, Mr. Giusti testified that FBI protocol requires that the results be discarded.
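The role of the two controls in items (5) and (6) can be sketched as a simple acceptance check. The Python below is a hypothetical illustration, not the FBI's protocol: the names `RunResult` and `run_is_acceptable`, the allele labels, and the representation of a profile as a set of labels are all assumptions made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    blank_alleles: frozenset   # alleles detected in the no-DNA blank
    k562_alleles: frozenset    # alleles detected in the K562 control
    k562_expected: frozenset   # the known K562 profile for this test


def run_is_acceptable(run: RunResult) -> bool:
    """Accept a run only if the blank is clean and K562 types as expected."""
    blank_clean = len(run.blank_alleles) == 0
    control_matches = run.k562_alleles == run.k562_expected
    return blank_clean and control_matches


expected = frozenset({"x", "y"})  # hypothetical allele labels
clean = RunResult(frozenset(), expected, expected)
tainted = RunResult(frozenset({"z"}), expected, expected)
print(run_is_acceptable(clean))    # True
print(run_is_acceptable(tainted))  # False
```

In this sketch a signal in the blank or a departure from the expected K562 result each suffices to reject the run, mirroring the testimony that results are discarded if either control indicates contamination.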
[22] Until recently the FBI's Quality Control Unit, a separate division from the DNA Unit, administered and evaluated the proficiency testing for the lab. Now the FBI contracts with Cellmark to conduct proficiency testing and evaluate the results for the DNA Unit.
[23] Because the TWGDAM guidelines have not been submitted as an exhibit, the Court does not have the precise wording.